Henry VIII and The Two Noble Kinsmen
from First Principles

A comprehensive authorship attribution study for Shakespeare scholars

March 2026 · Codex research workspace · Database: 527 early modern plays · Target: Henry VIII (First Folio, play ID 502)

This report presents the full findings of a first-principles computational authorship study of Henry VIII. It expands substantially on the accompanying PDF, explaining every methodological decision in plain prose, comparing results against recent scholarship, and providing the complete chunk-level data. The central finding: the broad traditional Fletcher scene map is too coarse. The play has a strong Shakespearean backbone with a smaller number of locally persistent Fletcherward pockets — narrower and more fragile than scene-level tradition implies.

1. The Problem

Henry VIII — or All Is True, as the play was known in its own time — is one of the longest-running authorship debates in Shakespeare studies. The play was printed in the First Folio of 1623 without attribution to any collaborator. It was accepted as entirely Shakespeare's until 1850, when James Spedding argued from internal verse evidence that substantial portions must be by a different hand. That hand, he proposed, was John Fletcher.

Spedding's argument has proven remarkably durable. It was reinforced by Cyrus Hoy's linguistic-marker work in the early 1960s, extended by Jonathan Hope's sociolinguistic analysis in the 1990s, and forcefully advocated by Brian Vickers in 2002. More recent computational work — especially Petr Plecháč's rolling attribution study published in Digital Scholarship in the Humanities in 2021 — broadly supports collaboration while introducing some refinements to the scene-level map.

Yet the question has also never been fully settled. A determined minority of scholars has maintained Shakespeare's sole authorship. Editors have generally been cautious about exact internal boundaries. And the methods used to argue for collaboration have shifted substantially over time: from verse metrics, to linguistic markers, to function-word profiles, to rolling computational attribution.

This project does not try to settle the full history of the question in one stroke. It does something more focused: it begins from the text itself rather than from a previously published map, it validates every method before interpreting results, and it asks what survives when suspicious scenes are divided into their local dramatic sub-units.

Why "from first principles"? The key methodological commitment of this project is that no previously published Henry VIII scene map is allowed to function as the answer key. Scholarship is used as a comparison layer and benchmark — not as a training label. This means that the traditional Spedding/Hoy divisions and the more recent Plecháč rolling map are treated as hypotheses to test, not ground truth to recover.

Why this is a hard problem

Four features of Henry VIII make the authorship question genuinely difficult, independent of any particular method:

No external documentary evidence. There is no contemporary document directly naming Fletcher as co-author. The collaboration hypothesis rests entirely on internal evidence — style, vocabulary, verse patterns, and linguistic habit.
The play contains multiple dramatic registers. Ceremonial pageantry, legal accusation, comic crowd scenes, intimate prophecy, and political counsel all appear in the same play. Different dramatic registers can themselves generate different stylometric profiles, independent of authorship.
The First Folio text may have been mediated. Compositorial practice, scribal transcription, and editorial normalization can affect the very features — spelling, contraction, function-word choice — that authorship methods depend on.
Methods disagree on units of attribution. Some scholars work at the scene level. Others work at the rolling-window level. Others work scene-by-scene. The scale at which you measure can determine what you find.

2. The History of Scholarship

Understanding the current project requires understanding what the existing scholarship has and has not established. The following is an expanded review of the major positions, with particular attention to their methodologies and their limitations.

James Spedding

1850 · "Who Wrote Shakespeare's Henry VIII?"

The historical starting point. Spedding argued from verse endings — particularly feminine endings — that large portions of the play bore a different stylistic character from Shakespeare's late verse. His proposed Fletcher territory included scenes 1.3, 1.4, 2.1, 2.2, 3.1, 4.1, 4.2, 5.3, and 5.4. This became the "canonical" map that most subsequent scholars have either defended or modified. Importantly, Spedding's method was already lineation-sensitive: he noticed that verse structure (not just vocabulary) differed. The current project's verse-and-rhyme analysis is a direct methodological descendant of this instinct.

Cyrus Hoy

1962 · Studies in Bibliography

Hoy's contribution was to shift the evidence from verse metrics to linguistic markers. He documented Fletcher's distinctive preferences — ye for you, 'em for them, has for hath, does for doth — across a large corpus of Fletcher's secure work. He found that these markers clustered in the Spedding-attributed scenes of Henry VIII, broadly confirming the scene map. The current project tests these markers carefully: several validate strongly on the secure reference corpus, but some (ye especially) are complicated by transmission and dramatic mode.

Jonathan Hope

1994 · The Authorship of Shakespeare's Plays

Hope introduced a sociolinguistic approach, focusing on second-person pronoun usage (the T/V distinction: thou/thee vs. you/your) as a marker of social register and authorial habit. His results supported collaboration in a general sense, though with important caveats about the relationship between sociolinguistic habit and dramatic mode. The current project tests the T/V layer independently; it validates at a moderate level (0.727 overall accuracy) but is not strong enough to adjudicate the full map alone.

Brian Vickers

2002 · Shakespeare, Co-Author

Vickers's monograph offered the most sustained advocacy for the collaboration hypothesis across multiple plays, including Henry VIII. He treated several scenes as securely Fletcherian and challenged critics who resisted the attribution. The current project is more cautious about the same scenes: 5.2 now looks largely Shakespearean under chunk-level analysis, and 3.1 appears unresolved rather than decisively Fletcherian.

Petr Plecháč

2021 · Digital Scholarship in the Humanities

The most important recent computational study. Plecháč combined most-frequent-word analysis with most-frequent-rhythmic-pattern analysis, using rolling attribution to assign authorship probabilities across the play without being locked to scene boundaries. His results broadly support Spedding while adding modifications that partially align with earlier Merriam-style analyses. He found that authorship shifts do not always respect scene boundaries — a key finding that this project's manual chunking approach also supports. However, his rolling trace is considerably more Fletcherward than the current project's local reproduction with a validated reference corpus.

Craig & Kinney

2009 · Shakespeare, Computers, and the Mystery of Authorship

Craig and Kinney's function-word tests of six Henry VIII scenes found that Fletcher wrote 3.1 and 5.2, while Shakespeare wrote the other four tested scenes. Their results partially diverge from the full Spedding map, and subsequent scholars have criticized their Zeta methodology. The current project independently recovers a conservative reading: 3.1 appears mixed rather than decisively Fletcherian, and 5.2 appears largely Shakespearean in three of four sub-chunks.

Will Sharpe

2016 (NOS edition) · 2023 (Shakespeare & Collaborative Writing)

Sharpe edited Henry VIII for The New Oxford Shakespeare and has continued to work on collaborative writing methodology. His 2023 Oxford book frames collaborative authorship in terms of theatre history, book history, and attribution evidence together. The NOS edition takes a cautious position on exact boundaries while affirming collaboration. This editorial caution is strongly aligned with the main finding of the present project.

Folger / Cambridge Editors

Various · Editorial tradition

Both the Folger and Cambridge editorial traditions treat collaboration as highly plausible but refrain from drawing firm scene-by-scene boundaries. The Folger appendix notes that the collaboration hypothesis rests on internal evidence without documentary confirmation. This position — collaboration is likely, exact boundaries remain unstettled — is precisely where the present project also lands.

Recent Literature (2020–2026)

Beyond the core texts above, the following recent scholarship is relevant to the present project:

Plecháč (2021, DSH): The published version of the arXiv preprint (1911.05652). Full citation: "Relative contributions of Shakespeare and Fletcher in Henry VIII: An analysis based on most frequent words and most frequent rhythmic patterns." Digital Scholarship in the Humanities 36.2 (2021): 430–438. This is the most directly comparable computational study. The current project's rolling attribution result is considerably more conservative than Plecháč's — the key difference being the reference corpus and feature-family choice.
Plecháč replication data (GitHub: versotym/henry8): Plecháč made all data and code publicly available. The present project did not use his code, but the availability allows direct methodological comparison. His method combines vocabulary frequency and verse rhythm; the current project uses character trigrams, function words, and linguistic markers, then adds CWFP as a content-word layer.
Sharpe (2023, Shakespeare & Collaborative Writing, OUP): A systematic methodological and historical account of Shakespeare's collaborative practices. Includes Henry VIII and The Two Noble Kinsmen as central cases. Key contribution: framing collaboration attribution as a problem of convergence across evidence types rather than any single test — exactly the multi-method approach this project uses.
Stanford Global Shakespeare Encyclopedia entry on Henry VIII: A useful recent summary of the scholarly consensus, noting that while most scholars now accept collaboration, "scholars using different methods have attempted to discriminate between Shakespeare's and Fletcher's parts without arriving at consensus."
Analytics India Magazine (2019–2020) / TechXplore coverage of ML approaches: Popular-science coverage of machine-learning attribution claiming Fletcher wrote roughly half the play. These reports often overstate the certainty of rolling attribution results. The present project's validation-first design is partly a response to overconfident ML claims in this space.
2021 Stylometric reanalysis (Craig/Kinney critique): Subsequent scholarship has identified methodological problems with the Zeta test as implemented in the 2009 Craig/Kinney volume. This critique is relevant because it shows how sensitive attribution conclusions can be to feature selection and distance metric — a core concern of the present validation framework.

The key pattern in recent scholarship The recent literature confirms a consistent pattern: (1) most computational studies support collaboration; (2) they disagree significantly on the extent and exact location of Fletcher material; (3) the disagreement often traces back to different reference corpora, feature families, and units of attribution. The present project addresses all three sources of variation explicitly.

3. Data, Texts, and the Reference Corpus

The database

All analysis rests on a local SQLite database of 527 early modern English plays. The database stores spoken-token sequences, lemmas (via MorphAdorner), speaker information, act/scene structure, and bibliographic metadata. Lemmatization allows comparison of lexical habits independently of surface spelling variation, which matters significantly for a play like Henry VIII where compositor practice may have standardized or altered original spelling.

The target play is Henry VIII (database play ID 502, 36th play in the First Folio subset). A diplomatic Folger transcription was separately consulted for the original-spelling investigation.

The secure reference corpus

Author-comparison analysis uses a deliberately narrow set of secure reference plays. These are plays whose single authorship is not seriously disputed and which provide enough text for reliable window-level comparison. The corpus is not claimed to represent the entirety of either author's output — it represents plays secure enough to use as a calibration standard.

Author	Play	Tokens
Fletcher	Bonduca	20,226
Fletcher	The Faithful Shepherdess	19,962
Fletcher	The Loyal Subject	25,579
Fletcher	The Mad Lover	18,378
Fletcher	The Woman's Prize	22,858
Fletcher	Valentinian	24,552
Shakespeare	Antony and Cleopatra	23,855
Shakespeare	As You Like It	21,442
Shakespeare	Coriolanus	26,675
Shakespeare	Hamlet	29,897
Shakespeare	Henry IV, Part 1	24,091
Shakespeare	Henry IV, Part 2	25,899
Shakespeare	Henry V	25,702
Shakespeare	Julius Caesar	19,131
Shakespeare	King Lear	25,376
Shakespeare	Macbeth	16,547
Shakespeare	Measure for Measure	21,346
Shakespeare	Othello	25,967
Shakespeare	Richard II	21,864
Shakespeare	The Tempest	16,159
Shakespeare	Twelfth Night	19,548
Shakespeare	Much Ado About Nothing	21,442

The corpus has 16 Shakespeare plays and 6 Fletcher plays. The asymmetry matters: the method will be more comfortable recognizing Shakespeare than Fletcher, and this is confirmed in validation. All conclusions are read against this known asymmetry.

Six governing assumptions

No previously published Henry VIII section map functions as the answer key.
Large scenes are not presumed homogeneous. If a scene changes local dramatic function, it may need manual subdivision.
No method is interpreted before being validated on secure single-author material.
No single feature family decides the case alone.
TNK can be used as a definitely mixed Shakespeare–Fletcher play, but not as a secure internal partition map.
Any chunk that weakens substantially under finer subdivision is treated cautiously, even if it looked persuasive at coarser scale.

4. How Henry VIII Was Manually Chunked

Why scene level was not enough

The project did not begin with the 46-chunk map. It began with scene-level exploration. That first pass immediately revealed a problem: many of Henry VIII's scenes are too long and too internally varied to be treated as single authorial units. A single scene in this play may contain a ceremonial opening, a legal accusation, a private emotional reaction, a packet exchange, and a formal conclusion — five dramatically distinct phases within one scene boundary. Attributing that kind of scene to one author is not just potentially wrong; it is not even a well-formed question.

The methodological response was manual chunking: inserting boundaries only where the dramatic action itself makes a local break intelligible, in the way a literary critic would notice a change even without any stylometric agenda.

Principles of chunking

A new chunk boundary was drawn when one or more of the following occurred:

A speaker bloc changed (e.g., from public noble exchange to private royal reflection)
The dramatic function changed (e.g., from ceremonial prelude to legal accusation)
The local rhetorical mode changed (e.g., from banter to proclamation, or from report to prophecy)
A scene contained an embedded unit that a literary critic would obviously discuss separately — even without any stylometric question

This means the chunk is best understood as a local dramatic unit, not a mechanical token slice. The result was a 46-chunk map across the play's 16 scenes.

Representative chunking examples

Chunk	Scene	Tokens	Label	Boundary rationale
`1.3a`	1.3	575	Court satire and French fashions	Compact comic-expository scene; kept whole because internal mode stays stable
`3.2a`	3.2	607	Noble coalition and anti-Wolsey exposure	Opening noble conference through Wolsey-Cromwell encounter before private alarm
`3.2b`	3.2	679	Packet exchange and Wolsey's inward alarm	Private alarm and preparation for downfall — distinct from the public confrontation
`3.2c`	3.2	123	Wolsey confronted and stripped	Brief formal article-recital block; highly compressed public humiliation
`3.2d`	3.2	935	Wolsey after the fall	Reflective post-fall aftermath — a dramatically distinct emotional register
`5.4a`	5.4	67	Procession naming and opening blessing	Ceremonial prelude before prophecy; mode shifts sharply from 5.3
`5.4b`	5.4	359	Cranmer's first prophecy movement	First prophetic arc; internally stable
`5.4c`	5.4	182	Final prophecy and king's answer	Second arc plus royal response; distinct closure movement
`5.4d`	5.4	123	Epilogue	Non-dramatic; separate unit for all analyses

Figure 1. The 46-chunk map of Henry VIII. Colors indicate the working judgment that emerged from the first-principles ledger: Shakespeare-like (teal), mixed/transitional (amber), Fletcher-like (red), or too short to assess (grey). The strong Shakespearean backbone running through Acts 1–3 is immediately visible, as are the narrow Fletcher-like pockets in 1.3 and 5.3.

Shakespeare-like or leaning chunks

Fletcher-like chunks in final map

Mixed / transitional chunks

5. How The Two Noble Kinsmen Was (and Was Not) Used

Earlier drafts of this project were tempted to use The Two Noble Kinsmen (TNK) as a validation case: if a method can recover the familiar internal Shakespeare/Fletcher partition of TNK, then it can be trusted on Henry VIII. This is a tempting shortcut, but it is methodologically circular.

The standard TNK partition (Acts 1, 2, and 5 mostly Shakespeare; Acts 3 and 4 mostly Fletcher) is itself disputed and internally derived. Using it as a validation target means assuming precisely what you are supposed to test. If your method recovers the disputed TNK map, all you have shown is that your method agrees with a prior disputed conclusion — not that either is correct.

The evidentiary rule adopted here TNK may be treated as a definitely mixed Shakespeare/Fletcher play. It may not be treated as if its internal act-by-act or scene-by-scene partition were secure ground truth. This means TNK enters the analysis only through coarse whole-play texture metrics — not through scene-labeled attribution comparisons.

This is a narrower use of TNK, but a cleaner one. Under this rule, TNK is useful for the following question: Do Henry VIII and TNK both show mixed internal texture clearly above the single-author baseline? That question can be answered without assuming we know which TNK scenes belong to which author.

6. The Methods, Explained

Method 1: The Refined Stack

The main local attribution method is the "refined stack," combining three feature families. Each Henry VIII chunk is represented in three different ways, and compared against rolling windows of the same length drawn from secure Shakespeare and Fletcher reference plays.

Character trigrams (char): overlapping three-character sequences (e.g., hon, ono, nou from honour). These capture recurrent spelling shapes and some lexical habits without becoming a simple bag of full words. They are particularly useful because they are somewhat insensitive to morphological inflection.

Function-word profile (fw): high-frequency words like and, of, with, for. In authorship work, function words are prized because they are less topic-dependent than content words — a playwright uses the and that habitually, regardless of whether the scene is about love or politics.

Marker profile (marker): a small targeted set of authorship markers with a known prior in Shakespeare/Fletcher attribution — especially pronoun preferences, auxiliary verb choices, and contraction habits (see Method 4 below for detail).

For each feature family, the chunk is compared to reference windows using cosine distance. The best-matching windows on each side are identified. The method then asks: on average, is this chunk nearer to Shakespeare windows or Fletcher windows? If two of three feature families agree on one author, that is the chunk's reading. If they split or the margin is small, the chunk is Mixed.

Why these three families, and not others? Early validation runs tested five families: function words, lemmas, character trigrams, markers, and dialogue geometry. Fletcher recovery was weak for lemmas (0.389) and very weak for geometry (0.250). The refined stack drops those two weak families and focuses on the three that actually discriminate Fletcher in the secure corpus.

Method 2: Rolling Attribution

The rolling-attribution pass asks the same authorship question continuously rather than chunk-by-chunk. A fixed-size window (400 or 700 tokens) is moved through the entire play one step at a time. Each window is compared to the secure reference corpus. The result is a continuous Fletcher-proxy trace — a number between 0 and 1 at each position, where higher values indicate closer proximity to the Fletcher reference plays.

This method is the direct local equivalent of Plecháč's rolling attribution. The key difference is the feature stack (trigrams + markers + function words here, versus most-frequent words + most-frequent rhythmic types in Plecháč) and the reference corpus (16 Shakespeare / 6 Fletcher here vs. a different selection in Plecháč).

The rolling method has two advantages over chunk-based analysis: it does not depend on chunk boundary placement, and it can reveal whether apparent Fletcher pressure forms a genuine continuous block or a brief local spike.

Method 3: Verse and Rhyme

The verse-and-rhyme layer reconstructs line-by-line structure from the token sequence and measures several lineation-sensitive features: feminine-ending tendencies, strong end-stopping, shared lines between speakers, and adjacent rhyme density. This layer matters because Spedding's original argument was already lineation-sensitive, and because several scholars (including Plecháč) have found verse patterns diagnostic.

However, in this project's validation, verse features perform only moderately on the secure reference corpus. Feminine endings achieve about 0.727 accuracy for the Shakespeare/Fletcher distinction. Shared-line and end-stop claims are weaker still. The verse layer is therefore used as a secondary check rather than a primary decision criterion.

Method 4: Classic Marker Audit

Rather than simply assuming that the traditional Fletcher markers work, the project tested them against the secure corpus. The question was: which traditional markers actually separate secure Shakespeare plays from secure Fletcher plays in the local reference data?

Marker	Favors	Accuracy	Strength	Usable?
feminine ending rate	Fletcher	0.727	moderate	Yes (secondary)
ye frequency	Fletcher	0.909	strong	Cautiously (transmission-sensitive)
'em rather than them	Fletcher	0.955	strong	Yes
has rather than hath	Fletcher	0.955	strong	Yes
does rather than doth	Fletcher	0.773	moderate	Yes
colloquial contraction density	Fletcher	0.909	strong	Yes

The strongest markers are 'em/them and has/hath, each reaching 0.955 accuracy. The ye marker is strong in principle but complicated by its sensitivity to database normalization — some lemmatization pipelines collapse ye and you, which would suppress the signal. Results using ye are read with extra caution.

Crucially, even strong markers behave unevenly across Henry VIII: they are often local within a scene rather than scene-wide. This is one of the most important empirical results in the entire project — it undermines the traditional assumption that a scene with several Fletcher marker hits is a uniformly Fletcherian scene.

Method 5: Original-Spelling Check

Because spelling arguments recur in the Henry VIII tradition (Hoy's work relied partly on ye and 'em in original spelling), the project returned to a diplomatic Folger transcription of the play. This layer was handled with caution. Ordinary Folio spellings — heere, beleeue, honour — are as likely to reflect compositor practice as authorial habit. More local colloquial forms — ye, 'em, ha's — are more informative but still potentially affected by scribe or compositor.

The main value of this layer is negative: spelling evidence can intensify a pre-existing suspicion, but it does not by itself settle a chunk.

Method 6: Content-Word Frequency Profile (CWFP)

The CWFP method asks a fundamentally different question from the refined stack. It ignores function words and linguistic markers entirely, and compares texts only through recurring content lemmas — words like honour, grace, blood, power, lord, love. The vocabulary is built from ~2,567 content lemmas that appear in at least 50% of reference plays.

CWFP is useful because it provides an independent signal. If a chunk shows Fletcherward pressure in both the refined stack and CWFP, that cross-method convergence is stronger evidence than either alone. CWFP is more vulnerable to subject matter (a politically intense scene will use different content words than a comic one), but it is less vulnerable to the marker-transmission problems that affect function-word approaches.

CWFP and authorship attribution CWFP was originally developed and validated for this project on the full First Folio corpus, achieving top-1 accuracy of 97.2% in identifying which First Folio play a text window comes from. Its validation on the Shakespeare/Fletcher binary classification task achieves 0.879 overall at 700-token windows — respectable but more conservative than the refined stack.

Method 7: Genre and Mode Contamination Test

The most important methodological question in the entire project is this: could the Fletcherward pressure in suspect Henry VIII chunks simply reflect their dramatic mode rather than their authorship? Court satire might look Fletcherward. Comic crowd scenes might look Fletcherward. Processional narration might look Fletcherward — regardless of who wrote the passage.

The contamination test directly addresses this objection. For each surviving pressure chunk, the project found exact-length windows from secure Shakespeare plays that matched the Henry VIII chunk as closely as possible in dramatic geometry: same approximate speaker count, same turn structure, same concentration of speech among a small number of speakers. These matched Shakespeare windows were then put through the refined stack and CWFP.

If the matched Shakespeare controls also drift Fletcherward, the contamination explanation is supported and the Henry VIII pressure is inconclusive. If the Henry VIII chunks are consistently more Fletcherward than their matched Shakespeare controls, the contamination defense weakens.

7. Validation Before Interpretation

The asymmetry in method performance is the most important single fact about the project's evidentiary standard. It must be understood before any Henry VIII result is interpreted.

Refined stack validation

Leave-one-play-out validation on the secure corpus gives the following results:

Window size	Overall accuracy	Shakespeare recovery	Fletcher recovery
200 tokens	0.879	1.000	0.556
400 tokens	0.909	1.000	0.667
700 tokens	0.955	1.000	0.833

CWFP validation

Window size	Overall accuracy	Shakespeare recovery	Fletcher recovery
200 tokens	0.788	0.938	0.389
400 tokens	0.879	1.000	0.556
700 tokens	0.879	0.979	0.611

Figure 2. Validation accuracy on the secure reference corpus for both main attribution methods. Both methods recover Shakespeare more reliably than Fletcher. This asymmetry means that a conservative Henry VIII result — one where relatively few chunks appear Fletcherian — cannot be taken at face value as evidence against collaboration. The method is easier on Shakespeare. Any positive Fletcher finding therefore carries more weight than an absence of Fletcher findings.

The asymmetry matters for interpretation Fletcher recovery at 700-token windows is 0.833 for the refined stack and only 0.611 for CWFP. At smaller chunks (many Henry VIII chunks are 200–500 tokens), Fletcher recovery falls further. This means: when the method says "Shakespeare," we can be quite confident. When it says "Mixed" for a small chunk, some of those could be Fletcher — the method simply cannot tell.

8. The Main Henry VIII Findings

The first-principles analysis produces a result that is neither the broad traditional Fletcher map nor an all-Shakespeare reading. The play has a genuine Shakespearean backbone — visible across multiple methods and across the manual chunking, the rolling attribution trace, and the content-word profile. But a handful of locally persistent Fletcherward pockets survive the full battery of tests.

The Shakespearean backbone

The Wolsey material — the political rise and fall across Acts 1–3 — remains consistently and strongly Shakespearean across the refined stack, rolling attribution, and content-word profile. Scenes like 1.1, 1.2, 2.4, and the entire 3.2 complex (Wolsey's fall and its aftermath) are among the most securely Shakespearean regions in any attribution analysis.

The legal-ceremonial scenes — particularly the trial of Katherine in 2.4 and the Cranmer vindication in Act 5 — are also strongly Shakespearean. The play's political-historical architecture appears to be Shakespeare's in its most fundamental design.

What looked Fletcherian and then weakened

Before local subdivision, several parent chunks looked suspicious. What happened to each under finer resolution is the key story of the project:

1.3a Court satire and French fashions — Initially the best local Fletcher island under the refined stack. But under manual subdivision into three local units, only the opening satire subunit retained any Fletcher signal. The central proclamation and closing Wolsey movement turned Shakespearean. Working judgment: cautiously F-like in the outer frame, mixed internally.

4.1b Coronation procession and immediate response — Strengthened briefly under the verse-and-rhyme layer. Under local resolution it did not sustain a stable Fletcher reading. The core ceremonial description turned Shakespeare-adjacent. Working judgment: Mixed.

5.2b Council attack on Cranmer — Earlier had attracted some lineation-based pressure. Under local resolution, three of four sub-units resolved Shakespearean. Only the brief Cromwell-Gardiner quarrel retained a mild Fletcherward lineation pull. Working judgment: Sh-like.

3.2c Wolsey confronted and stripped — Showed article-recital pressure. Under local resolution, all four sub-units resolved Shakespearean. Working judgment: Sh-like.

5.3a Porter comic crowd-control — The strongest and most persistent Fletcher pressure point across all methods. Remains F-like in refined stack, rolling attribution (at both 400 and 700 tokens), and shows Fletcherward CWFP margin. Working judgment: F-like.

Figure 3. What happens when the main parent pressure chunks are divided into local sub-units. In most cases, the local sub-units are predominantly Shakespearean even when the parent chunk looked suspicious. This does not mean the pressure was imaginary — it means the old scene-scale labels were too coarse. The most important residue after subdivision is in 1.3a (partly) and 5.3a (strongly).

Figure 4. The classic Spedding/Hoy Fletcher scenes shown alongside the chunk-level working map from this study. The key observation: traditional scene-level assignments often cover regions that the chunk-level analysis shows to be internally divided or largely Shakespearean. The traditional map is not simply wrong — its pressure points are often real — but it is too coarse about the internal composition of those scenes.

Figure 5. The whole-play method matrix: for each chunk, the working ledger judgment, the refined stack result, the CWFP result, and the rolling attribution result at 400 and 700 tokens. The most important pattern: the four methods broadly agree on the Shakespeare-like backbone. Where they disagree most is in the middle of the play (Acts 2–3) and in the problem zones of 1.3 and 5.3.

Rolling Attribution: The Continuous View

The rolling attribution pass approaches Henry VIII from the opposite direction from manual chunking: it makes no assumptions about where dramatic units begin and end, and simply asks, window by window, whether each 400- or 700-token stretch is closer to the Shakespeare or Fletcher reference corpus.

The result is striking. The continuous trace is overwhelmingly Mixed, with only a very small number of decisively Fletcher-classified windows:

Window size	Total windows	Shakespeare	Fletcher	Mixed
400 tokens	184	33	3	148
700 tokens	182	32	1	149

The rolling trace does not produce a broad alternating Shakespeare/Fletcher map. The play is not neatly divisible into large Shakespeare blocks and large Fletcher blocks. Instead, most windows sit in the mixed zone, with a small number of strong Shakespeare windows in the Wolsey and trial material, and a handful of Fletcher windows concentrated in 5.3a and, more weakly, 1.3a.

This is the most direct local answer to the Plecháč-style rolling question. The current result is considerably more conservative than Plecháč's. The key methodological differences: the current project's reference corpus is more carefully validated, and the feature stack (trigrams + markers + function words) is different from Plecháč's most-frequent-word plus verse-rhythm combination. Whether this difference reflects a genuine methodological improvement, or reflects the different feature families responding to different aspects of style, is itself an open research question.

Figure 6. The rolling attribution trace across Henry VIII. Each point represents one overlapping window; higher values indicate closer proximity to the Fletcher reference plays. The strong Shakespearean windows in Acts 1–3 (the Wolsey material) are clearly visible. The Fletcher spikes are narrow and localized — not the broad continuous Fletcher blocks that scene-level tradition implies.

9. The TNK Coarse Benchmark

Under the stricter evidentiary rule, TNK is used only to ask: do Henry VIII and TNK both show coarse mixed internal texture, clearly above the single-author baseline? The answer is yes for both plays — but in different ways.

Play	Window	Windows	Mixed share	Switch rate	Entropy
Henry VIII	400	58	0.810	0.263	0.701
Henry VIII	700	57	0.772	0.232	0.775
The Two Noble Kinsmen	400	58	0.931	0.123	0.362
The Two Noble Kinsmen	700	58	0.948	0.105	0.294

TNK has a higher mixed share — more of its windows sit in the middle zone — but a lower switch rate and lower entropy. This means TNK looks steadily mixed, while Henry VIII looks more volatile: it swings more sharply between Shakespeareward and Fletcherward readings. Both plays are well above the single-author baseline in internal texture complexity.

This difference between the two plays is itself informative. If the method simply flattened all plays into undifferentiated ambiguity, Henry VIII and TNK would look identical. They do not. Henry VIII is more volatile; TNK is more uniformly mixed. That distinction is consistent with Henry VIII having a strong Shakespearean backbone with a smaller number of concentrated Fletcherward pockets, while TNK has more evenly interleaved authorial contributions.

Figure 7. Rolling Fletcher-proxy traces for Henry VIII and TNK at 700-token windows. TNK (right) is more steadily mixed; Henry VIII (left) fluctuates more sharply. The narrow Fletcherward spikes in Henry VIII contrast with TNK's more uniform mixed texture. Note that the current project does not use this comparison to locate Fletcher in TNK — it uses it only to ask whether both plays are recognizably mixed relative to single-author plays.

Figure 8. Whole-play mixed-texture metrics for Henry VIII, TNK, and the secure single-author reference plays. Both Henry VIII and TNK sit clearly above the single-author plays in mixed-share and several variance metrics. The difference between them illustrates that the method is sensitive to the character of the mixing, not just its presence.

10. CWFP Results

The CWFP pass provides a new kind of evidence because it is based on common content words rather than function words or markers. A chunk that shows Fletcherward pressure in CWFP cannot be dismissed by arguing that the refined stack merely reflects marker frequency or function-word habit.

Summary of CWFP classifications

Shakespeare-leaning chunks

Fletcher chunk (5.4b)

Mixed chunks

CWFP does not restore the traditional large Fletcher share. Only one chunk receives a clear Fletcher classification — 5.4b (Cranmer's first prophecy movement), whose nearest CWFP neighbor is Fletcher's Valentinian. But the margin pattern is more informative than the classification count: many chunks that CWFP classifies as Mixed are slightly closer to the Fletcher reference than to Shakespeare, and these subtle margins converge with other methods in consistent ways.

Most Fletcher-leaning by CWFP

Chunk	Label	CWFP Fletcher margin	Nearest play
`5.4b`	Cranmer's first prophecy	−0.025	Fletcher / Valentinian
`3.2d`	Wolsey after the fall	−0.017	Fletcher / Valentinian
`3.1b`	Cardinals with Katherine	−0.016	Fletcher / Valentinian
`4.2b`	Griffith's defense and Katherine's vision	−0.010	Fletcher / Valentinian
`4.1b`	Coronation procession	−0.010	Fletcher / The Mad Lover
`2.2b`	Wolsey-Campeius divorce management	−0.009	Fletcher / Valentinian
`5.3a`	Porter comic crowd-control	−0.009	Fletcher / The Woman's Prize

Most Shakespeare-leaning by CWFP

Chunk	Label	CWFP Shakespeare margin	Nearest play
`1.1c`	Buckingham arrest and fatal exit	+0.041	Shakespeare / Measure for Measure
`2.4d`	King's conscience narrative	+0.039	Shakespeare / King Lear
`1.2c`	Buckingham matter and surveyor	+0.037	Shakespeare / Measure for Measure
`2.4c`	King's vindication of Wolsey	+0.036	Shakespeare / Antony and Cleopatra
`3.2a`	Noble coalition and exposure	+0.033	Shakespeare / Measure for Measure
`1.1b`	Wolsey grievance and foreign-policy	+0.030	Shakespeare / Henry V

Figure 9. The surviving pressure chunks under the two main attribution methods. Negative margin means the chunk is more Fletcherward. The overlaid points show the number of validated Fletcherward markers in the same chunk. The convergence of negative CWFP margin, negative refined margin, and high marker count in the same chunks is the strongest available evidence for locally persistent Fletcherward pressure.

CWFP–Marker Crosswalk

Cross-method convergence is the strongest form of evidence available. The following chunks show both high validated marker hits (≥3) and a Fletcherward CWFP margin:

Chunk	Label	Marker hits	CWFP nearest	Refined result
`1.3a`	Court satire and French fashions	3	Fletcher / The Mad Lover	F-like
`2.2b`	Wolsey-Campeius divorce management	3	Fletcher / Valentinian	Sh-like
`3.1b`	Cardinals with Katherine	3	Fletcher / Valentinian	Mixed
`3.2d`	Wolsey after the fall	3	Fletcher / Valentinian	Sh-like
`4.1a`	Gentlemanly prelude to coronation	3	Fletcher / The Mad Lover	Sh-like
`4.1c`	Third Gentleman narrates coronation	3	Fletcher / The Mad Lover	Sh-like
`5.4c`	Final prophecy and king's answer	3	Fletcher / Valentinian	Mixed

Note that several of these chunks — 2.2b, 3.2d, 4.1a, 4.1c — are classified as Shakespeare-like by the refined stack despite having high marker counts and Fletcherward CWFP neighbors. This is an important interpretive point: it suggests that the marker and content-word signals in these chunks may be picking up dramatic mode (ceremonial narration, counselor banter, public accusation) rather than authorial identity. The genre/mode contamination test was designed precisely to investigate this possibility.

11. The Genre/Mode Contamination Test: Results

This is the methodological crux of the project. For each surviving pressure chunk, the test found 20 exact-length windows from secure Shakespeare plays with the closest available dramatic geometry (speaker count, turn structure, speaker concentration). Those confirmed-Shakespeare windows were then run through the refined stack and CWFP. If they also drift Fletcherward, the Henry VIII pressure could be dismissed as a mode effect. If they do not, the contamination defense weakens.

Key result: the contamination defense largely fails Under the refined stack, matched Shakespeare controls produced zero Fletcher classifications across all tested pressure chunks. Under CWFP, matched Shakespeare controls also produced zero Fletcher classifications. Several Henry VIII chunks — especially 1.3a, 5.3a, 3.1b, and 3.2d — are more Fletcherward than nearly all of their matched Shakespeare controls.

Chunk	Label	Refined target	Matched Sh controls (F/M/S)	Refined percentile	CWFP target	Matched Sh controls (F/M/S)	CWFP percentile
`1.3a`	Court satire	F-like	0 F / 1 M / 19 S	0.000	Mixed	0 F / 9 M / 11 S	0.100
`2.2b`	Wolsey-Campeius	Sh-like	0 F / 0 M / 20 S	0.000	Mixed	0 F / 10 M / 10 S	0.000
`3.1b`	Cardinals with Katherine	Mixed	0 F / 0 M / 20 S	0.000	Mixed	0 F / 5 M / 15 S	0.000
`3.2d`	Wolsey after the fall	Sh-like	0 F / 0 M / 20 S	0.000	Mixed	0 F / 6 M / 14 S	0.000
`4.1a`	Gentlemanly prelude	Sh-like	0 F / 1 M / 19 S	0.050	Mixed	0 F / 11 M / 9 S	0.250
`4.1c`	Third Gentleman narrates	Sh-like	0 F / 0 M / 20 S	0.000	Mixed	0 F / 8 M / 12 S	0.000
`5.3a`	Porter crowd-control	F-like	0 F / 0 M / 20 S	0.000	Mixed	0 F / 12 M / 8 S	0.000
`5.4c`	Final prophecy	Mixed	0 F / 3 M / 17 S	0.000	Mixed	0 F / 12 M / 8 S	0.100

The percentile column is the key number. A percentile of 0.000 means the Henry VIII chunk is more Fletcherward than every single one of the 20 matched Shakespeare controls. Most of the live pressure chunks achieve this. This is a strong result: it means the contamination objection cannot explain away the pressure in these specific locations.

What the contamination test does not prove The test shows that comparable Shakespeare scenes in the secure corpus do not drift Fletcherward in the same way the Henry VIII pressure chunks do. This changes the burden of proof — you can no longer dismiss the pressure as a normal mode effect. But it does not prove that Fletcher wrote these chunks. Other explanations remain possible: Shakespeare's style at this point in his career was unusually close to Fletcher's; the passages involved unusual dramatic modes that Shakespeare rarely used elsewhere; or transmission effects affected exactly these passages. The test narrows the field of explanations, but does not eliminate all alternatives.

Figure 10. Percentile of each live Henry VIII pressure chunk among its matched Shakespeare controls. Lower percentile means the Henry VIII chunk is more Fletcherward than most of the Shakespeare controls matched to it in dramatic geometry. The near-zero percentiles for 1.3a, 5.3a, 3.1b, 3.2d, 4.1c, and 2.2b are the most important numbers in this chart.

12. Comparing the Current Map to the Traditional Scene Partition

The following table shows each of the sixteen scenes in Henry VIII, indicating whether traditional scholarship assigned it to Fletcher, how many validated marker hits the current analysis found, and how the current chunk-level analysis distributes the scene's local units.

Scene	Traditional Fletcher scene?	Validated marker hits	Current Sh-like chunks	Current Mixed chunks	Current F-like chunks
1.1	No	1	3	—	—
1.2	No	1	3	—	—
1.3	Yes (Spedding)	3	—	—	1
1.4	Yes (Spedding)	2	2	1	—
2.1	Yes (Spedding)	3	2	1	—
2.2	Yes (Spedding)	2	2	—	—
2.3	No	1	—	2	—
2.4	No	1	3	1	—
3.1	Yes (Spedding)	3	—	2	—
3.2	No (partly in some traditions)	—	4	—	—
4.1	No (Spedding had it Sh)	3	2	1	—
4.2	Yes (Spedding)	1	2	1	—
5.1	No	—	3	1	—
5.2	Yes (some traditions)	1	1	2	—
5.3	Yes (Spedding)	2	—	—	1
5.4	Yes (Spedding)	1	1	1	—

The most striking pattern: among the nine scenes that Spedding assigned to Fletcher, the current chunk-level analysis finds only two scenes that contain any F-like chunks (1.3 and 5.3). The other seven scenes that Spedding called Fletcher territory contain chunks that resolve as Shakespearean or Mixed under the current analysis. This does not mean Spedding was entirely wrong — his pressure instincts were often correct at the scene level. It means the Fletcher signal within those scenes is often local rather than scene-wide.

13. Conclusions

The findings can be stated in a way that is both cautious and affirmative. They do not require choosing between "Spedding was right" and "the play is all Shakespeare." A more specific and better-supported picture is available.

Finding 1: A Shakespearean backbone is real The Wolsey material (Acts 1–3), the trial of Katherine (2.4), the Cranmer vindication (Act 5), and most of the play's legal-historical architecture are consistently Shakespearean across the refined stack, rolling attribution, CWFP, and the subchunk resolution studies. This result is robust across multiple methods and multiple units of analysis.

Finding 2: A few locally persistent Fletcherward pockets survive After all validation, subchunk resolution, and contamination testing, a small set of chunks remain consistently more Fletcherward than matched Shakespeare controls: most strongly 5.3a (Porter crowd-control), and with lesser confidence 1.3a (court satire), 3.1b (Cardinals with Katherine), and 5.4c (final prophecy). These pockets are real: they are not easily explained by dramatic mode, they survive content-word profiling, and they sit well outside the distribution of matched Shakespeare controls.

Finding 3: The traditional broad scene map is too coarse The Spedding/Hoy scene-level partition overstates both the continuity and the extent of the Fletcher signal. Many scenes traditionally assigned to Fletcher — including large parts of 2.1, 2.2, 3.1, 4.2, 5.2, and 5.4 — contain substantial Shakespearean material under chunk-level analysis. The Fletcher signal within these scenes is often local (a sub-block, a particular mode, a brief exchange) rather than scene-wide.

Finding 4: TNK is a definitely mixed play but not a scene-labeled answer key The coarse benchmark confirms that both Henry VIII and TNK show mixed internal texture clearly above the single-author baseline. Henry VIII is more volatile in its mixing; TNK is more uniformly mixed. Using TNK as a labeled validation case — as some earlier workflows were tempted to do — would have introduced circular reasoning, and the present project's decision to reject that move is methodologically important.

Finding 5: The best picture is intermediate Henry VIII is neither cleanly divisible into large Fletcher scenes nor reducible to an all-Shakespeare reading. It is better described as a largely Shakespearean play with a small number of locally persistent Fletcherward pockets — narrower, more fragile, and more concentrated than the classic scene tradition implies. The most important revision to received wisdom is not "Fletcher disappears." It is: "the broad old map overstates how continuous and how scene-wide the Fletcher signal is."

What would change the picture

The following findings would substantially revise the current conclusions:

A larger, better-balanced Fletcher reference corpus (currently only 6 plays) that improves Fletcher recovery in validation — this would allow stronger positive Fletcher claims
A validated spelling investigation on the diplomatic Folger text that shows systematic colloquial contraction patterns aligning with the pressure chunks — this would strengthen the marker evidence
A Plecháč-style replication that uses the same feature stack as the current project but finds broader Fletcher blocks — this would indicate the difference is in the reference corpus rather than the features
Newly discovered documentary evidence linking Fletcher to the play — this would remove the inference problem entirely, though it would not settle the internal map

Appendix: Complete Chunk-by-Chunk Method Ledger

The following table gives all 46 chunks with their scene assignment, token count, short label, and judgments from the four main analysis layers: the refined stack, CWFP, rolling attribution at 400 and 700 tokens, validated marker hit count, and the working judgment that synthesizes across methods.

Chunk	Scene	Tokens	Label	Refined	CWFP	Roll 400	Roll 700	Markers	Judgment
`1.1a`	1.1	681	Field-of-Cloth spectacle	Sh	Mix	Mix	Mix	1	Sh-like
`1.1b`	1.1	1159	Wolsey grievance and corruption	Sh	Sh	Mix	Mix	3	Sh-like
`1.1c`	1.1	237	Buckingham arrest	Sh	Sh	Sh	Mix	2	Sh-like
`1.2a`	1.2	423	Royal opening and grievance	Sh	Sh	Mix	Mix	1	Sh-like
`1.2b`	1.2	437	Taxation debate and revocation	Sh	Mix	Mix	Mix	2	Sh-like
`1.2c`	1.2	871	Buckingham matter and surveyor	Sh	Sh	Mix	Mix	2	Sh-like
`1.3a`	1.3	575	Court satire, French fashions	Fl	Mix	Fl	Mix	3	F-like
`1.4a`	1.4	293	Banquet wit and erotic banter	Sh	Mix	Mix	Mix	3	Sh-like
`1.4b`	1.4	228	Wolsey's feast under interruption	Mix	Mix	Mix	Mix	2	Mixed
`1.4c`	1.4	373	Masque recognition, Anne singled out	Sh	Mix	Mix	Mix	3	Sh-like
`2.1a`	2.1	441	Gentlemen on Buckingham's trial	Sh	Mix	Mix	Mix	3	Sh-like
`2.1b`	2.1	708	Buckingham's public valediction	Mix	Mix	Mix	Mix	2	Mixed
`2.1c`	2.1	262	Aftermath and divorce rumor	Sh	Mix	Mix	Mix	2	Sh-like
`2.2a`	2.2	674	Anti-Wolsey counsel	Sh	Mix	Mix	Mix	3	Sh-like
`2.2b`	2.2	488	Wolsey-Campeius divorce management	Sh	Mix	Mix	Mix	3	Sh-like
`2.3a`	2.3	412	Anne's pity, anti-eminence talk	Mix	Mix	Mix	Mix	2	Mixed
`2.3b`	2.3	475	Elevation of Anne	Mix	Mix	Mix	Mix	2	Mixed
`2.4a`	2.4	438	Ceremonial opening, Katherine's plea	Sh	Sh	Mix	Mix	1	Sh-like
`2.4b`	2.4	601	Wolsey-Campeius answer, Katherine's refusal	Mix	Sh	Mix	Mix	—	Mixed
`2.4c`	2.4	28	King's vindication of Katherine	Sh	Sh	Mix	Mix	2	Sh-like
`2.4d`	2.4	595	King's conscience narrative	Sh	Sh	Mix	Mix	2	Sh-like
`3.1a`	3.1	165	Katherine's melancholy and reception	Mix	Mix	Mix	Mix	2	Mixed
`3.1b`	3.1	1347	Cardinals with Katherine	Mix	Mix	Mix	Mix	3	Mixed
`3.2a`	3.2	607	Noble coalition, anti-Wolsey exposure	Sh	Sh	Mix	Mix	—	Sh-like
`3.2b`	3.2	679	Packet exchange, Wolsey's alarm	Sh	Sh	Mix	Mix	2	Sh-like
`3.2c`	3.2	123	Wolsey confronted and stripped	Sh	Mix	Mix	Mix	3	Sh-like
`3.2d`	3.2	935	Wolsey after the fall	Sh	Mix	Mix	Mix	3	Sh-like
`4.1a`	4.1	295	Gentlemanly prelude to coronation	Sh	Mix	Mix	Mix	3	Sh-like
`4.1b`	4.1	157	Coronation procession	Mix	Mix	Mix	Mix	2	Mixed
`4.1c`	4.1	53	Third Gentleman narrates coronation	Sh	Mix	Mix	Mix	3	Sh-like
`4.2a`	4.2	358	Katherine judges Wolsey	Sh	Mix	Mix	Mix	2	Sh-like
`4.2b`	4.2	277	Griffith's defense, Katherine's vision	Mix	Mix	Mix	Mix	1	Mixed
`4.2c`	4.2	745	Capuchius, Katherine's final petitions	Sh	Mix	Mix	Mix	3	Sh-like
`5.1a`	5.1	472	Gardiner and Lovell on Cranmer	Sh	Sh	Mix	Mix	2	Sh-like
`5.1b`	5.1	223	King, Suffolk, queen's labor	Sh	Mix	Mix	Mix	—	Sh-like
`5.1c`	5.1	584	King and Cranmer in private	Mix	Mix	Mix	Mix	2	Mixed
`5.1d`	5.1	16	Birth report, Old Lady comic close	Sh	Mix	Mix	Mix	2	Sh-like
`5.2a`	5.2	289	Cranmer waiting, king observes	Mix	Mix	Mix	Mix	2	Mixed
`5.2b`	5.2	787	Council attack on Cranmer	Sh	Mix	Mix	Mix	3	Sh-like
`5.2c`	5.2	735	Ring and royal restoration	Mix	Mix	Mix	Mix	2	Mixed
`5.3a`	5.3	592	Porter comic crowd-control	Fl	Mix	Fl	Fl	2	F-like
`5.3b`	5.3	138	Chamberlain restores order	—	Mix	Mix	Mix	1	—
`5.4a`	5.4	67	Procession naming, opening blessing	—	Mix	Mix	Mix	—	—
`5.4b`	5.4	359	Cranmer's first prophecy	Sh	Fl	Mix	Mix	1	Sh-like
`5.4c`	5.4	182	Final prophecy and king's answer	Mix	Mix	Mix	Mix	3	Mixed
`5.4d`	5.4	123	Epilogue	—	Mix	Mix	Mix	2	—

Works Cited and Comparison Points

Spedding, James. "Who Wrote Shakespeare's Henry VIII?" Gentleman's Magazine (1850). Princeton archive.
Hoy, Cyrus. "The Shares of Fletcher and His Collaborators in the Beaumont and Fletcher Canon." Studies in Bibliography 8–15 (1956–1962).
Hope, Jonathan. The Authorship of Shakespeare's Plays: A Socio-linguistic Study. Cambridge: Cambridge UP, 1994. Cambridge chapter.
Vickers, Brian. Shakespeare, Co-Author: A Historical Study of Five Collaborative Plays. Oxford: OUP, 2002. OUP.
Craig, Hugh and Arthur F. Kinney. Shakespeare, Computers, and the Mystery of Authorship. Cambridge: Cambridge UP, 2009.
Plecháč, Petr. "Relative contributions of Shakespeare and Fletcher in Henry VIII: An Analysis Based on Most Frequent Words and Most Frequent Rhythmic Patterns." Digital Scholarship in the Humanities 36.2 (2021): 430–438. arXiv preprint · Published version · Replication data.
Sharpe, Will, ed. King Henry VIII; or All is True. New Oxford Shakespeare. Oxford: OUP, 2016.
Sharpe, Will. Shakespeare & Collaborative Writing. Oxford Shakespeare Topics. Oxford: OUP, 2023.
Margeson, John, intro. Henry VIII. Cambridge: Cambridge UP. Cambridge appendix.
Folger Shakespeare Library. Henry VIII Appendix on Authorship. Folger.
Stanford Global Shakespeare Encyclopedia, "Henry VIII" entry. Stanford.

Henry VIII Authorship Project · March 2026 · Codex research workspace
Database: early_modern_plays.db · 527 plays · Target: play ID 502 · Chunks: 46

Henry VIII and The Two Noble Kinsmenfrom First Principles

1. The Problem

Why this is a hard problem

2. The History of Scholarship

Recent Literature (2020–2026)

3. Data, Texts, and the Reference Corpus

The database

The secure reference corpus

Six governing assumptions

4. How Henry VIII Was Manually Chunked

Why scene level was not enough

Principles of chunking

Representative chunking examples

5. How The Two Noble Kinsmen Was (and Was Not) Used

6. The Methods, Explained

Method 1: The Refined Stack

Method 2: Rolling Attribution

Method 3: Verse and Rhyme

Method 4: Classic Marker Audit

Method 5: Original-Spelling Check

Method 6: Content-Word Frequency Profile (CWFP)

Method 7: Genre and Mode Contamination Test

7. Validation Before Interpretation

Refined stack validation

CWFP validation

8. The Main Henry VIII Findings

The Shakespearean backbone

What looked Fletcherian and then weakened

Rolling Attribution: The Continuous View

9. The TNK Coarse Benchmark

10. CWFP Results

Summary of CWFP classifications

Most Fletcher-leaning by CWFP

Most Shakespeare-leaning by CWFP

CWFP–Marker Crosswalk

11. The Genre/Mode Contamination Test: Results

12. Comparing the Current Map to the Traditional Scene Partition

13. Conclusions

What would change the picture

Appendix: Complete Chunk-by-Chunk Method Ledger

Works Cited and Comparison Points

Henry VIII and The Two Noble Kinsmen
from First Principles