Wikipedia talk:Manual of Style/Capital letters

Capitalization discussions ongoing (keep at top of talk page)

Add new items at top of list; move to Concluded when decided, and summarize the conclusion. Comment at them if interested. Please keep this section at the top of the page.

Current

(newest on top) Move requests:

  • None noted

Other discussions:

Concluded

Sentence case or title case for multi-word WikiProject names?

There seems to be an inconsistency in how the capitalization of multi-word WikiProject names is handled. Title case examples include: WP:WikiProject Crime and Criminal Biography and WP:WikiProject Ice Hockey. Sentence case examples include: WP:WikiProject Video games, WP:WikiProject Classical music, and WP:WikiProject College football. Has there ever been a consensus or guideline about this? Left guide (talk) 09:34, 22 April 2025 (UTC)

I had brought up this issue at this RM: WT:WikiProject Women's Health#Requested move 11 April 2025. Still open. My impression is that the majority are named such that the words after "WikiProject" match the title of a main article, i.e. sentence case. When that's not the case, more work is needed to avoid linking over-capitalized redirects, and things like that. Dicklyon (talk) 06:40, 23 April 2025 (UTC)[reply]
Surely actual sentence case would be "WP:WikiProject crime and criminal biography", "WP:WikiProject ice hockey", "WP:WikiProject video games", etc. I don't know what style name to call capitalizing the first and second words and then not capitalizing the rest. —David Eppstein (talk) 07:13, 23 April 2025 (UTC)[reply]
I guess the word "WikiProject" is considered a de facto prefix since it exists on every WikiProject title, at least that's how I'm able to make sense of it. Left guide (talk) 07:16, 23 April 2025 (UTC)[reply]
When we use sentence case, even when things start with stopper words like "The", we don't use it as an excuse to capitalize only the second word of the title. The namespace prefix is the "WP:", not what comes after it. And articles such as [1] that talk about these projects include "WikiProject" as part of the project title, not just a piece of syntax used internally to refer to the project. —David Eppstein (talk) 07:45, 23 April 2025 (UTC)[reply]
Indeed, the common capitalization pattern is not very logical. But it is the dominant convention, so I'm not in favor of changing it. I just want to fix the few that are more capitalized than usual, since they tend to cause links to show up on Wikipedia:Database reports/Linked miscapitalizations in template space. Dicklyon (talk) 05:38, 24 April 2025 (UTC)
My perspective is that, if anything, WikiProject should be at the end of these names. Hey man im josh (talk) 15:34, 23 April 2025 (UTC)[reply]
There are plenty of university departments of mathematics (lowercase meaning that they are departments that house mathematicians and teach mathematics) whose proper name is Department of Mathematics, others whose proper name is Mathematics Department, and others that have other names entirely. If you consider starting a new WikiProject, you could consider naming it WP:Onomastics WikiProject, just to introduce some of that same name diversity into our system. But the existing WikiProjects have names (proper noun phrases) that happen to have WikiProject in front, and in many cases have the other words capitalized in their name. You wouldn't suggest changing the capitalization of "United Kingdom of Great Britain and Northern Ireland" to "United kingdom of Great Britain and Northern Ireland" merely because "kingdom" is also a common English word and is used with its usual meaning in that phrase, would you? This suggestion strikes me as in the same vein. —David Eppstein (talk) 17:29, 23 April 2025 (UTC)[reply]
That's a bit too radical and disruptive, I expect. But moving a few projects to the most common style should be a lot easier to swallow. It's not about MOS compliance per se, since these are not in article space, but making this change will make it easier to maintain links to over-capitalized redirects, which is what brought this up. Dicklyon (talk) 17:40, 23 April 2025 (UTC)[reply]
"The most common style" being first and second words capitalized, rest lowercase?? —David Eppstein (talk) 17:56, 23 April 2025 (UTC)[reply]
Pretty much so. I.e. the most common style is "WikiProject" followed by a main article title as if it's sentence-initial, like WP:WikiProject College football. Dicklyon (talk) 05:42, 24 April 2025 (UTC)[reply]
If that's what those WikiProjects want to name themselves then ok, but I think it's a stupid style to impose on others that want proper capitalization for their proper noun phrase names. —David Eppstein (talk) 06:52, 24 April 2025 (UTC)[reply]
It's not about proper nouns, but about how to most easily keep the project names, category names, and project templates on talk pages working correctly together when the main topic is not a proper noun phrase. Dicklyon (talk) 23:37, 25 April 2025 (UTC)
Who cares whether the name of the WikiProject, a proper noun phrase, was constructed using English words that happen to be non-proper noun phrases or with made-up neologisms? It is still the name of the WikiProject, a proper noun phrase, exactly as the United Kingdom of Great Britain and Northern Ireland is a proper noun phrase that happens to have been constructed using non-proper nouns and adjectives united, kingdom, of, great, and, northern. —David Eppstein (talk) 01:06, 26 April 2025 (UTC)[reply]
Who cares whether you think of WikiProject names as proper noun phrases? Most of them are not treated that way. Dicklyon (talk) 04:52, 27 April 2025 (UTC)[reply]
I care, since whether something is a proper noun is entirely relevant to whether it's capitalized. Hey man im josh (talk) 15:28, 30 April 2025 (UTC)[reply]

What? Now there's an attempt to lowercase all WikiProject titles, too? GoodDay (talk) 05:20, 27 April 2025 (UTC)[reply]

Absolutely not. Requires renaming thousands of pages, breaking categories, modules, links and templates all over the place, for no benefit whatsoever. Hawkeye7 (discuss) 05:35, 27 April 2025 (UTC)[reply]
Good Day is confused when he says "there's an attempt to lowercase all WikiProject titles". I don't have a complete list, but it looks like there are about 800 or so WikiProjects with multiword "main article" names; the majority of those are proper names (e.g. WP:WikiProject James Bond), and are not at issue here at all. Of the 300 or so that are not proper names, about 230 already have the name in the WikiProject matching the capitalization of the main article. Only about 70 are the subject of this discussion; e.g. WP:WikiProject Ice Hockey with main article Ice hockey. So, dozens, not thousands. Getting these fixed without breaking anything should be straightforward, but will take some time. Dicklyon (talk) 02:42, 28 April 2025 (UTC)
I am here as lead coordinator of Wikipedia:WikiProject Military history (sentence case). And I am telling you that changing it is not simple or straightforward at all. We have thousands of pages, extensive archives, templates, Lua modules and bots. And there is no way that you could make such a complex change without breaking anything. Hawkeye7 (discuss) 04:11, 28 April 2025 (UTC)[reply]
@Hawkeye7:, I don't doubt that the push for lower-casing will eventually succeed, in this area. It'll never stop, until it does. GoodDay (talk) 02:58, 28 April 2025 (UTC)[reply]
  • It should be up to the members of each WikiProject what they want to call themselves and how they want to style it. For example, WikiProject:Civil Rights Movement is uppercased due to the fact that the 1954-1968 Civil Rights Movement is a proper name, which is apparent to involved editors. It's inactive as far as I know, but still there and still properly named. Randy Kryn (talk) 03:38, 28 April 2025 (UTC)
    Seconding this, here we go again with the lowercase crusading. Leave it up to individual Wikiprojects to decide their names, this doesn't have anything to do with article-space content and therefore (IMO) the MoS only loosely applies. The Kip (contribs) 04:09, 28 April 2025 (UTC)[reply]
    The MoS is strictly limited to articles. Changing "articles" to "pages" (or any change broadening MOS's scope of applicability) would require a widely advertised RfC. (WP:MOS) Hawkeye7 (discuss) 04:23, 28 April 2025 (UTC)[reply]
    Thanks, further backs up my opinion. The Kip (contribs) 04:23, 28 April 2025 (UTC)[reply]
    I agree, it's not an MOS issue. It interacts a bit with MOS-related maintenance through reports of linked miscapitalizations, as I said at the start. And those can be dealt with in other ways, so I'll work on those other ways if there's no appetite for more consistent project naming. Dicklyon (talk) 05:32, 28 April 2025 (UTC)[reply]
  • As one of the people in WikiProject Crime: who cares? It's a WikiProject. WikiProjects are only of relevance to a small set of highly interested editors so unless those editors care why should we. Also, as someone who has done them, WP moves are a nightmare every time. You have to move sometimes thousands of pages. We wanted to move it back to WikiProject Crime (the current, lengthy name is a result of a very long 2000s era argument that was not properly resolved until 2023) but we decided it was not worth the hassle. PARAKANYAA (talk) 05:46, 28 April 2025 (UTC)[reply]
    Seconded. Any gain made in increased consistency will be completely outweighed by the enormous amount of work to change all of the projects where this applies, and then retrain everyone used to the current names to use the new ones. Doing this for something that isn't even reader-facing is a total waste of effort that could be spent on something more useful, like watching paint dry. Caeciliusinhorto-public (talk) 09:13, 28 April 2025 (UTC)[reply]
  • Let WikiProjects name themselves! Good grief. I know this thread has gone stale, but I thought I'd share another example: WP:WikiProject LGBTQ+ studies last year changed its name to add the ⟨+⟩ even though consensus has not been to add this to LGBTQ people and all other related articles. Note that many of the same editors contribute to discussions about the WikiProject name and related article titles. The Project name reflects the perspective and preferences of its participants, even while acknowledging that usage and other factors recommend a different form in article titles. Imposing conformity from the outside is a waste of time and energy that does not improve the encyclopedia. PS: I realize this example is not CAPS related but principle is the same. --MYCETEAE 🍄‍🟫—talk 23:02, 17 June 2025 (UTC)[reply]
    Yeah, this should have been closed. It turns out there's a template parameter that provides an easy fix to avoid linking over-capitalized redirects in the project template. E.g. in Template:WikiProject Ice Hockey, using |MAIN_ARTICLE = ice hockey prevents linking through the over-capitalized redirect Ice Hockey, and shows up as "This article is within the scope of WikiProject Ice Hockey, a collaborative effort to improve the coverage of ice hockey on Wikipedia." The MAIN_ARTICLE parameter just needs to be implemented on more projects. Dicklyon (talk) 23:13, 17 June 2025 (UTC)
    Good that there is a technical fix! --MYCETEAE 🍄‍🟫—talk 11:30, 23 June 2025 (UTC)[reply]
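As a rough sketch of the MAIN_ARTICLE fix described above: assuming the project banner is built on the standard WikiProject banner meta-template (written here as WPBannerMeta, which is an assumption about how Template:WikiProject Ice Hockey is implemented), the relevant part of the banner would look something like the fragment below. Only the MAIN_ARTICLE value comes from the discussion; the PROJECT line and the placeholder comment are illustrative.

```wikitext
{{WPBannerMeta
 |PROJECT      = Ice Hockey
 |MAIN_ARTICLE = ice hockey
 <!-- other banner parameters left unchanged -->
}}
```

With the lowercase value, the banner text links straight to the article Ice hockey rather than through the over-capitalized redirect Ice Hockey, so the project page itself does not need to be moved.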

RfC on the meaning of "usually" as used in MOS:MILTERMS

Should the spirit and intent of "usually capitalized in sources" at MOS:MILTERMS be taken as consistent with the general advice on capitalisation given in the lead of MOS:CAPS, or is the spirit and intent to create a substantially different and lower threshold for capitalising the types of events named?

Cinderella157 (talk) 03:41, 4 June 2025 (UTC)[reply]

The subject text at MOS:MILTERMS is as follows:

Accepted names of wars, battles, revolts, revolutions, rebellions, mutinies, skirmishes, fronts, raids, actions, operations, and so forth are capitalized if they are usually capitalized in sources (Spanish Civil War, Battle of Leipzig, Boxer Rebellion, Action of 8 July 1716, Western Front, Operation Sea Lion).

The matter is discussed above in the section MOS:MILTERMS.

Please comment by indicating Consistent or Lower. Cinderella157 (talk) 03:42, 4 June 2025 (UTC)[reply]

Notified at MILHIST. Cinderella157 (talk) 03:48, 4 June 2025 (UTC)[reply]

  • Consistent MOS:MILTERMS is part of MOS:CAPS. The opening paragraph of MOS:MILTERMS states: The general rule is that wherever a military term is an accepted proper name, as indicated by consistent capitalization in sources, it should be capitalized. The general advice in the lead paragraph of MOS:CAPS is often paraphrased as requiring consistent capitalisation. Some have argued that usually herein means any degree of usage just greater than 50%. As Firefangledfeathers notes above: It's odd to see an unexplained clash between the general rule and the specific rule, and it's untenable to have the clash be open to interpretation. However, such an interpretation clashes not only with the general rule but the more proximate rule in the lead paragraph at MILTERMS. The issue is not just whether usually should reasonably be interpreted as greater than 50% but whether doing so reflects the spirit and intent of the guidance. At multiple places, we are told that the spirit of any P&G is paramount rather than skirting the spirit on some technicality - perceived or real (eg WP:P&G, WP:5P, WP:IAR?, WP:PRINCIPLE, WP:MR and WP:LAWYERING). If the spirit of using usually is intended to create a lower threshold then we would need a substantive reason for doing so.
The Merriam-Webster definition for usually is: according to the usual or ordinary course of things : most often : as a rule : customarily, ordinarily. This source collates linguistic studies on how various terms (including usually) are usually perceived as percentages - reporting that usually is perceived as 70 - 84 percent of the time. It also gives the definition from the OED: In a usual or wonted manner; according to customary, established, or frequent usage; commonly, customarily, ordinarily; as a rule. Those arguing a lower threshold would seize on one part of the definition most often as being just greater than equal. As with any law, rule etc, the meaning of a definition should be read in the fuller context and a balance of all the parts. Seizing on one part in isolation is the epitome of a WP:PETTIFOGING argument. Considering the definition and linguistic interpretation of usually, the meaning is consistent with both the general advice in the lead of MOS:CAPS and the lead paragraph of MILTERMS. These are subject to the same conflicting views on whether these are proper names as any other name on WP which is descriptive and take the definite article in prose - unless they are consistently capped in sources.
Many editors are of a view that any name having a specific referent is a proper name that must be capitalised. While a specific referent is a property of a proper name, it is not a defining property since specificity of referent is also conveyed by the definite article (the). If there is anything that defines a proper name, it is that it is not descriptive. However, it is because of these different views that MOS:CAPS relies on consistent capitalisation in sources to determine what we capitalise rather than semantic arguments of what defines a proper name. This is the consensus of the broader community and is reflected by the consensus of a vast majority of (RM) discussions both generally and more specifically for battles, wars etc. As a group, the names identifying many battles, wars etc take the definite article in prose and are inherently descriptive - eg the Battle of Waterloo is a battle that occurred near Waterloo. As a group, these are commonly capitalised but there are a significant number of exceptions for specific battles, wars etc such as the Syrian civil war, where Syrian civil war is not consistently capped in sources. There is no apparent substantive reason why these should be considered as a group as an exception from the general guidance, particularly when the lead paragraph at MILTERMS reinforces that the general guidance applies to MILTERMS.
Asserting that usually creates a lesser standard than the general guidance is clearly contrary to the usual meaning and the spirit and intent, reading in the fuller context of MOS:CAPS as a whole and the more specific guidance at MILTERMS. Cinderella157 (talk) 03:43, 4 June 2025 (UTC)[reply]
Many editors are of a view that any name having a specific referent is a proper name that must be capitalised. Those are the rules of the English language: "Names of people, places and organisations are called proper nouns. We spell proper nouns with a capital letter"[2] While a specific referent is a property of a proper name, it is not a defining property since specificity of referent is also conveyed by the definite article (the). "The" is not necessary to make something a "specific referent", we say "Berlin", not "The Berlin"; adding "the" is an exception that arose through use, e.g. "The Grand Canyon". TurboSuperA+(connect) 09:50, 10 June 2025 (UTC)[reply]
Yes, we do capitalise proper nouns. This is not disputed. However, because something is spelled with a capital letter, that does not make it ipso facto a proper noun|name. English often capitalises descriptive names for emphasis, significance or as a term of art. If you read the Merriam-Webster definition or our article proper noun you will see that proper nouns are also not descriptive. I was not saying that proper nouns must take the definite article (the) to be specific (as you would indicate with the example Berlin). What I was saying is that the definite article confers specificity and therefore, specificity of referent is not a defining property of a proper noun. Consequently, names such as the Crimean B|blockade or the Syrian C|civil W|war are not ipso facto proper nouns because they take the definite article in prose. Your example the Grand Canyon is considered a proper noun even though it might appear descriptive (the canyon which is grand). This is partly because it is common to capitalise descriptors such as canyon, bay, sea etc (but not all descriptors) in geographical names. Secondly, we should not be confused by the etymology of the name where somebody said this looks grand, let's call it the Grand Canyon since they might just as easily have called it something else like Kings Canyon. The ngram for Grand Canyon here is pretty much always capped compared with Syrian civil war here [contextualised for prose]. However, because WP relies on usage in sources to determine capitalisation, we capitalise American Civil War because, even though it is not a true proper noun, it is consistently capitalised in sources (see here).
If we remove usually in the sentence at MILTERMS, it begs the question as to what is an accepted name of wars etc, since clearly, not all wars, battles etc are proper nouns. They are descriptive in nature, they take the definite article in prose and not all are consistently capitalised in sources. For the rest of this, you can read my reply to Chicdat below. Cinderella157 (talk) 11:15, 10 June 2025 (UTC)[reply]
  • Remove usually altogether – text was added without discussion by Dicklyon six years ago. I will copy my comment from a recent RM: The operative word here is "accepted" – thus, the event has an actual, accepted common name, not a descriptive name (e.g. American Civil War is accepted, War in Afghanistan is [descriptive]). This is putting into words common sense, something that has never really existed at MOS:CAPS. Accepted = proper name. Proper names are capitalized. Please find any grammar or style guide that contradicts that. 🐔 Chicdat  Bawk to me! 11:06, 7 June 2025 (UTC)[reply]
    You attempted to remove usually but were reverted by another with the edit summary: ... if this text has been here for 6 years it has implicit consensus ... Consequently, your comment is not a surprise. Removing usually begs the question: what is an accepted name - but you already answer this question: Accepted equals proper name [equals sign won't render here]. Therefore, we capitalise names of wars etc if they are proper names. WP (MOS:CAPS) treats those names which are consistently capitalised in sources as a proper name (per the lead). Accepted equals proper name represents the spirit and intent of the subject sentence. As you note, the names given to wars, battles, revolutions etc are not all proper names and the names of articles using these terms are not always correctly capitalised. Without usually, there is no conflict between the subject sentence and the lead paragraph of MILTERMS or the general advice in the lead. If usually is understood as synonymous with consistently, there is no conflict either. Such an understanding is consistent with reading the definition of usually on balance and the evidence of linguistic studies. Arguments that usually creates a lower threshold for caps than the general advice are based on an aberrant meaning of usually (by taking one part of the definition in isolation rather than on balance) such that the subject sentence would be inconsistent with the general advice. As you have identified, accepted equals proper name, and such an argument is contrary to the spirit of the subject sentence as you have identified it. I see that adding usually affirms the consistency with the general advice and believe this was the intent of adding it. Perhaps Dicklyon can affirm this. With or without usually the intent of the subject sentence is to affirm the general guidance in the lead. Cinderella157 (talk) 01:57, 8 June 2025 (UTC)
  • Revert and remove "usually" per Chicdat, proper names are uppercased on Wikipedia. To lessen that obvious commonsense view, the word "usually" (which means 'most often') was added without discussion and has since been used to lowercase proper names. An easy fix to bring the guideline back the status of its original meaning. As for the meaning of the word "usually", the only objective term used in dictionary meanings is "most often", which asserts a majority, or the name most commonly used, and nothing more. Randy Kryn (talk) 11:43, 7 June 2025 (UTC)[reply]
  • Amending my statement, as people are actually saying "usually" doesn't mean what it means. Either "usually" is kept, which sets the standard of "most often" (i.e. either 50.1% or the name used more than any other) or the wording reverts to include all wars, battles, etc. "Usually" at least sets a bar for those who want to keep it, but it certainly doesn't mean "always" or "consistently", it means most often, and is maybe the best idea to use it for all title casings and not only MILTERMS. Randy Kryn (talk) 12:26, 10 June 2025 (UTC)[reply]
  • Consistent, per Cinderella157 and Dicklyon. Gawaon (talk) 07:37, 10 June 2025 (UTC)[reply]
  • Consistent based on reading the relevant sections of both policies. Seems pretty straightforward: follow abundant reliable sources.
MOS:CAPS: "Wikipedia relies on sources to determine what is conventionally capitalized; only words and phrases that are consistently capitalized in a substantial majority of independent, reliable sources are capitalized in Wikipedia."
MOS:MILTERMS: "[W]herever a military term is an accepted proper name, as indicated by consistent capitalization in sources, it should be capitalized."
As far as Cinderella157's supporting comment, WP:TLDR. Penguino35 (talk) 14:07, 26 June 2025 (UTC)[reply]
  • Consistent per above. There was never an agreement of intent to establish a lower threshold. That was a reinterpretation after the fact. And the word shouldn't be deleted, as the comments above show some desire for the absence of the word to be interpreted differently. —⁠ ⁠BarrelProof (talk) 21:26, 27 June 2025 (UTC)[reply]
  • Broken RfC, this RfC is about the word "usually", not about replacing it with another word. Replacing it for another word with a different meaning falls outside the scope of the RfC question. It's either remove it or keep it as is. Wikipedia should not be changing the meaning of a word which is defined as "most often", and "spirit and intent" language is strange wording with no basis in guidelines or policy. "Usually" means what sources say it means, "most often". Either keep it or remove it, but don't redefine it. Randy Kryn (talk) 02:35, 28 June 2025 (UTC)[reply]
    There's nobody who could define English words once and for all, that's not how languages work. Words get their sense from their usage, and the usage can vary over time, region, and users. Dictionaries can help a lot, though of course they too will not always agree. I don't know from which dictionary you drew your "most often" description, but in Wiktionary I find the descriptions "Most of the time; less than always, but more than occasionally" and "Under normal conditions". But where in the "less than always, but more than occasionally" range do we want it to fall in this case? Or what are "normal conditions" and when do they no longer apply? Those are reasonable questions for an RfC to ask and as I understand this RfC, it's meant to do essentially just that – clarify the intended meaning of an inherently somewhat vague and ambiguous word for this specific case. Gawaon (talk) 06:56, 28 June 2025 (UTC)
  • Bad RfC more or less per Randy Kryn. We have over the years developed an unfortunate habit of using words in ways that are different from and even contrary to their normal meaning though this double usage of words as terms-of-art is by no means limited to us. I even bear some small share of blame for that. I suppose in many cases it isn't that bad because frequency of usage and context allows people to figure out the intended meanings without too much difficulty. However we really want to avoid future occurrences even if it leads to somewhat dry technical language being employed.
    Thus it is logical to propose a rewording for clarity, or to remove a word, or even to remove the whole paragraph. If the intent here is to say that this is not an exception or special case then it shouldn't be there at all; it is rather backwards to list something in an exceptions area only to say it is not an exception, so please don't write guidelines that way. But what we should not be doing is having RfCs to redefine one specific instance of a word's appearance, unless you deliberately want to make projectspace even more confusing for new and casual editors.
    Assertions that we should draft imprecisely because semantic drift is inevitable are unconvincing and prove too much. If and when such shifts happen rewording can and will be done to maintain meaning, assuming practice doesn't shift, but we should strive to reduce ambiguity not create more of it. 184.152.65.118 (talk) 20:58, 12 July 2025 (UTC)[reply]
  • Consistent is the best available option, since it reduces the impact that an unnecessary specific rule is having on a useful general rule. Better options would be to rework MILTERMS more significantly, or make a small change like replacing "usually" with "consistently". I oppose removing "usually", and I see the unexplained clash between it and both the MILTERMS opener and CAPS more generally to be untenable. Firefangledfeathers (talk / contribs) 14:10, 16 July 2025 (UTC)[reply]
  • Remove usually per Chicdat's train of logic. If that means a "lower" standard, then so be it. SWATJester Shoot Blues, Tell VileRat! 22:07, 4 August 2025 (UTC)[reply]

Edit to MOS:GEOCAPS

Randy Kryn, in this edit you changed the sentence of GEOCAPS from

Names such as Japan, the Nile, New York, Buenos Aires, and Tierra del Fuego are treated as proper names and take an initial capital letter on all major elements.

to

These are treated like other proper names and take an initial capital letter on all major elements: Japan, the Nile, New York, Buenos Aires, and Tierra del Fuego are treated as proper names and take an initial capital letter on all major elements.

With the edit summary:

added back the removed portion: Geographical or place names are the nouns used to refer to specificPlace names, added back the long-term and important, and for some reason recently removed, language: "These are treated like other proper names and take an initial capital letter on all major elements: ". I haven't looked when this was done, but please refrain from removing this or any other essential guideline language, thanks. [sic]

It was I that made some amendments to this sentence between this edit, with the edit summary:

Resolving an inconsistency between this and Wikipedia:Naming conventions (geographic names). Also resolving that this section does not create a specific exception to the general advice at the lead. The section, Wikipedia:Manual of Style/Capital letters#Proper names (of which GEOCAPS is part) was merged into MOS:CAPS. Wikipedia:Manual of Style/Capital letters tells us to cap proper names but does not tell us what a proper name is. For this, we defer to usage per lead

and this edit. You participated in the changes to this sentence with this edit. The changes were made in consequence of a discussion with ModernDayTrilobite at User talk:ModernDayTrilobite#Close at Talk:Galactic Center#Requested move 21 March 2025 in their capacity as closer (noting they subsequently reverted their close) - particularly this reply by them:

You ask what in GEOCAPS makes the distinction clear, but honestly, I don't think there is a clear distinction drawn for these particular cases; we have to evaluate the evidence on a case-by-case basis to gauge whether there's a proper noun or not.

This was in response to this question:

If I said that I am going to: Japan; Mount Everest; the Gulf of Tonkin, Boston (which could be one of two dozen odd places); or, the town hall, the capital city or the savannah; in each case, I am referring to and going to a definite specific place. In the latter, this is because of the definite article (the). How is it (why does) the galactic centre (of the Milky Way) fall to the former examples and not the latter? What in GEOCAPS makes the distinction clear?

You were a party to both the amendment of this sentence and to the discussion giving rise to it. Your revert at this time (near two months later) has the appearance of being quite disingenuous. Cinderella157 (talk) 04:54, 8 June 2025 (UTC)[reply]

I was not aware that what had been removed was that different than what you replaced it with (having never memorized the language). In any case, I added back some of the language that was removed, it was quickly reverted, and your preferred wording was back in place when you wrote the above. The point was to assure that other editors know that the names of places are considered proper names, which is less clear in the present (as compared to the long-term) wording. Randy Kryn (talk) 10:25, 8 June 2025 (UTC)[reply]
Randy Kryn, presumably, you became aware of my initial edit because you watch MOS:CAPS and saw the diff that clearly identifies the changes. My edit summary gave a clear statement - ie that the language (if seen as a lesser threshold for capitalisation) was inconsistent with both the general guidance at MOS:CAPS and Wikipedia:Naming conventions (geographic names). The edit where I wrote If I said that I am going to: Japan; Mount Everest ... is time stamped 10:01, 15 April 2025. My initial edit to GEOCAPS is time stamped 02:40, 18 April 2025. Where you say your preferred wording was back in place when you wrote the above would be incorrect? ModernDayTrilobite's response quoted above (You ask what in GEOCAPS makes the distinction clear, but honestly ...) was made in respect to GEOCAPS before my edits. It is time stamped 14:42, 15 April 2025. Reverting my edits for the primary purpose of maintaining a loophole against the underlying principles of the guidance (at both MOS:CAPS and WP:NCGN) would appear to be inappropriate. Is there a substantive reason to challenge the rationale and implementation of my edits or can we consider this resolved? Cinderella157 (talk) 11:32, 23 June 2025 (UTC) Amended duplicated link target to separate targets. Cinderella157 (talk) 03:14, 24 June 2025 (UTC)
Cinderella157, on a quick reading I probably missed the implication that you may have been trying to achieve: to lessen the language that all places and geographical locations are proper names (which is what the long-term wording showed). As long as that understanding is still in place and clarified, fine. If not, change it back to the long-term language, thanks. Randy Kryn (talk) 11:39, 23 June 2025 (UTC)[reply]

"These are treated like other proper names and take an initial capital letter on all major elements: ... are treated as proper names and ..." is redundant and repetitive. That's reason enough to revert this. But to the extent that this seems like an attempt to (or with an effect of) bypassing the test at the top of MOS:CAPS (only words and phrases that are consistently capitalized in a substantial majority of independent, reliable sources are capitalized in Wikipedia), by trying to declare any appellation applied to something geographical to "be" a "proper name" categorically, it should also be reverted, as not just lacking consensus but directly contrary to long-established consensus encapsulated in the lead of MOS:CAPS and represented in about 20 years of RM results. We base these determinations on source usage not on someone's personal selection of what "proper name" should mean from among the dozens of competing definitions pulled from philosophy and linguistics pundits (see WP:PNPN).  — SMcCandlish ¢ 😼  20:13, 16 June 2025 (UTC)[reply]

  • Randy, my edit was intended as a subtle hint that pursuing the point was probably not a good idea. You would interpret the former version of GEOCAPS (ie before my initial edit) as requiring capitalisation for any place that can be described as a particular place and have argued this in RMs (all places and geographical locations are proper names).[3] Per the post close discussion with MDT the guidance does not make a clear distinction as to what geographical names should be capitalised just because they have a specific referent but these must be assessed on a case by case basis - ie we are back to the general guidance to determine what should be capitalised.

You appear to be arguing that GEOCAPS is a loophole to go against the underlying principles of MOS:CAPS and WP:NCGN. This would appear to be against the spirit of the pertinent guidance and could reasonably be seen as falling to Wikipedia:Wikilawyering and Wikipedia:Gaming the system. Opposing an edit which would clarify the guidance (in accordance with the underlying principles) for the primary purpose of maintaining a loophole that can be exploited to game the system only compounds this. This is not a good faith reason to challenge the rationale and implementation of my edits. Furthermore, you have alleged that I edited GEOCAPS before MDT's comment that I rely upon rather than relying upon MDT's comment to support the change. That is clearly a false allegation.

Your novel interpretations of P&G can be amusing but ... Can we consider this resolved?

(Cinderella157's post moved from my talk page to this main discussion) I don't really know what comment came first, many of your comments are tl;dr so you shouldn't be surprised that acting on those discussions may be the first that some editors are aware of them. In this case MOS:GEOCAPS used to be very clear about the casing of named places and geographical features: "Geographical or place names are the nouns used to refer to specific places and geographic features. These are treated like other proper names and take an initial capital letter on all major elements: Japan, Mount Everest, Gulf of Tonkin." I don't know how it can be clearer, so forgive my confusion that it really didn't mean what it said about proper names. Randy Kryn (talk) 04:01, 24 June 2025 (UTC)[reply]

Request for Comments on what is a proper name

What is a proper name and how do you source it? There have been questions about what a proper name is and how it is sourced. There seem to be different problems with objects, actions, and maybe ideas. I am going to start with "objects" here, others can expand to actions.

Simply, could a manufacturer create a name for an object they make using their choice of letters and generic words in either upper or lower case? What Reliable Source is needed to show what the object's proper name is?

  • Possible spelling variations (Acme is the company): acme shovel - Acme shovel - Acme Shovel - Acme SHOVEL - Acme ShoVel.
  • Possible sources: - Name on object - Name in company document - Name in government document - Name in sources used - Name in potential sources - Name used most often (Commonname).

To be clear, I am not asking about MOS grammar, or when to use a proper name, only what a proper name is.

Thank you to anyone who shows up. Sammy D III (talk) 21:29, 10 June 2025 (UTC)[reply]

@The ed17: I did RFCBEFORE, I thought I had it. If it's clearer now could you replace tag? I don't want to myself. Sammy D III (talk) 22:29, 10 June 2025 (UTC)[reply]
@Sammy D III: I'm sorry, but I'm still confused. As Rally Wonk says above, there is plenty of info available in reliable sources to answer the question. RfCs are for dispute resolution; you would use it here to clarify part of MOS:CAPS or propose a change to it.
So, do you think MOS:PROPER needs a definition of the term instead of linking out to our article? If so, you should workshop a RfC intro and then propose it. Ed [talk] [OMT] 02:02, 11 June 2025 (UTC)[reply]
@The ed17: [4][5][6][7][8] + other boards and several editors. I thought proper names were important and that I could help with outside eyes. Feel free to close this, I'm not going to bother any more. Sammy D III (talk) 02:23, 11 June 2025 (UTC)[reply]
Thanks for the links. Given those wider questions, it does seem like you think MOS:PROPER needs better guidance on what article titles qualify as proper names. That would be a potentially good topic for a RfC. Ed [talk] [OMT] 03:04, 11 June 2025 (UTC)[reply]
Getting back, I'm in way over my head. I just really screwed up a RfC on that very subject. Can't wait for that to be archived. Thanks for talking. Sammy D III (talk) 13:39, 12 June 2025 (UTC)[reply]
(edit add): Oh, that's this one! So much is going on. Sammy D III (talk) 13:42, 12 June 2025 (UTC)[reply]
I'm not thinking about the real world, I'm talking about inside Wikipedia. Look at the discussions above. There are multiple arguments going on right now, I could link but I don't think we disagree? Sammy D III (talk) 22:42, 10 June 2025 (UTC)[reply]
There's far too much text on this page to digest, I'm not going to read through it all to try to understand what you are trying to say. Please don't start a new discussion and invite people from projects to attract them, to send them to another discussion 'somewhere on the page'. If you are already involved in other ongoing discussions, this one is needless.
You asked specifically not about MOS, grammar or when to use a proper name, but just what a proper name is. I answered that, Wikipedia is still the real world, definitions don't change! Post an example, specific question and maybe link to a discussion or close this one.
Last thoughts; as far as I am aware, placing uppercase anywhere does not change the spelling of a word. It's hard to say without knowing what the sources are saying "Acme Shovel" is. If Acme is the make and Shovel the product, then it's very probably not a proper name but two names together. Unless it's an IP'd name in full, which can be checked online in many jurisdictions, but then it is still a maybe. There's no catch-all answer based on your example. Rally Wonk (talk) 23:36, 10 June 2025 (UTC)[reply]
I did not mean for you to read them, just to look and know that people are arguing and we need outside eyes. I'm really dirty myself. I want uninvolved editors to come to this discussion and talk about names, there is so much conflict up there and on many other pages. The names are meant as neutral examples of what could be, I thought "Acme" was a recognizable fictitious company name that sold shovels. The sources are clear, correct? You are uninvolved and you did talk, maybe I can get it going. Ideas? Sammy D III (talk) 23:54, 10 June 2025 (UTC)[reply]
Consider Wikipedia talk:Manual of Style/Capital letters/Archive 42 § Proper name vs. "consistently capitalized" to perhaps narrow the scope of the question. Good luck. —Bagumba (talk) 02:53, 11 June 2025 (UTC)[reply]
To be clear, I am not asking about MOS grammar, or when to use a proper name, only what a proper name is: I glanced over this before, but if this is not MOS related, you can refer to the Proper noun page, which you can improve, if needed. —Bagumba (talk) 04:25, 11 June 2025 (UTC)[reply]
  • A proper noun is a noun that refers to a unique/specific person (Abraham Lincoln), place (Djibouti), event (Coachella Festival). Regarding the last example, if I say "Coachella Festival" then it is clear that I mean Coachella Festival, but if I say "Coachella festival" then I could be referring to any festival that happens in Coachella Valley. While the use of the/a/an can show the distinction, English grammar does not strictly require the use of an article to show specificity. We don't have to say "The Abraham Lincoln", although we can in order to emphasise it, as in The Abraham Lincoln. TurboSuperA+(connect) 05:42, 11 June 2025 (UTC)[reply]
Please explain this to WP:SHIPS. Proper nouns are allowed to use the definite article there. For example "the Abraham Lincoln sailed into port yesterday". Or "the Neptune was launched on XX September." Llammakey (talk) 11:49, 11 June 2025 (UTC)[reply]

Maybe this will be clearer (I didn't want to point at anything specific): Example: A RM on Five freedoms, a UK program, is here. The question is if a government program is a proper name. The editors don't matter, try the reasoning. Sammy D III (talk) 13:23, 11 June 2025 (UTC)[reply]

Are you suggesting that government programs always have proper names? And that the "five freedoms" topic is about a government program? I didn't see that there. It's certainly not what the article lead says ("The Five Freedoms,[1] sometimes referred to as the Five Freedoms model,[2] is a framework for assessing animal welfare that uses an outline of five aspects.[1]"). Dicklyon (talk) 23:09, 11 June 2025 (UTC)[reply]
  • For WP purposes, because MOS:PN is a subsection of MOS:CAPS, "proper names" are those which pass the opening test of MOS:CAPS: words and phrases that are consistently capitalized in a substantial majority of independent, reliable sources (emphasis in original), and which are not capitalized for some special reason covered by another guideline, e.g. being acronyms/initialisms or measurement-unit symbols. This is by design. MoS and WP are entirely bypassing the endless 200+ year debate about what "proper name" means. For a summary of that unworkable shitshow, see WP:PNPN. Most capitalization-related disputes on WP would stop dead if people would stop trying to import philosophy-based "proper name" definitions into RM arguments; they have absolutely nothing to do with typography, and even less to do with WP's policies and guidelines.  — SMcCandlish ¢ 😼  20:06, 16 June 2025 (UTC)[reply]
  • If one wishes to know what a proper name is and how it is defined, then Proper name is a good place to begin. An issue is that many have a perception that is simplistic and incomplete (per the fuller criteria given at proper name). As a rule, we capitalise (use title case for) business names, the names of institutions, proprietary names and the titles of works. If there is an issue with proper name, it would group all of these things under a common banner of "proper name", where these are "rules" in their own right. In doing so, I can see some inconsistencies in proper name. If one wishes to use a definitional or rule based approach to what is capped on WP rather than an empirical approach as we presently have, be careful what you wish for. Such an approach would render capitalisation much closer to the "rules" of French, where only proper names, business names, names of institutions and proprietary names are capitalised but not descriptive names (the descriptive parts of names) such as in the American Civil War or the Mississippi River. Cinderella157 (talk) 22:24, 23 June 2025 (UTC)[reply]

It is time we talked about Google Ngram

Google Ngram has been used uncritically to justify lowercasing articles. Two examples from [9]:

As you can see, there is a clear preference for uppercase in the last ~40 years (for Vicksburg) and last ~70 years for Gettysburg. Let's recap what WP:RS says: Wikipedia articles should be based mainly on reliable secondary sources, and Although specific facts may be taken from primary sources, secondary sources that present the same material are preferred.

I propose that we introduce a time limit/cutoff for Google Ngram results. Something like "Google Ngram results from the last n years should be considered in lowercase rename discussions". This is what the results would look like if the cutoff were:

I am well aware that 75 and 30 years are arbitrary, but I think a cutoff is necessary to eliminate primary sources and sources close to the event when the event did not yet have a proper name, since proper names arise from use. There needs to be some way to judge recent and contemporary capitalisation, because Wikipedia should reflect current usage, not how people spelled it back in the 19th century.

Furthermore, Google Ngram results show word use, but they do not show the context in which the word was used. For Vicksburg (Ngram singular, Ngram plural) we aren't sure if "Vicksburg campaign" is referring to the topic of the Wikipedia article or to a Vicksburg campaign during the American Civil War. Therefore Google Ngram results cannot and should not be the be-all and end-all of lowercase discussions. TurboSuperA+(connect) 17:44, 11 June 2025 (UTC)[reply]

I don't think anyone has ever suggested that Google Ngram results can and should be the be-all and end-all of lowercase discussions. Gawaon (talk) 18:02, 11 June 2025 (UTC)[reply]
And yet... Talk:Gettysburg campaign#Reverting move TurboSuperA+(connect) 19:28, 11 June 2025 (UTC)[reply]
I did explain to MarcusBritish there how he messed up on his attempt to get n-gram stats in context there (he lowercased Gettysburg when he lowercased campaign, so it's no wonder he fooled himself with n-gram stats). Dicklyon (talk) 22:47, 11 June 2025 (UTC)[reply]
Have you encountered Dicklyon in such discussions? It's their first piece of evidence that they cite to support lowercasing. Andy Dingley (talk) 21:41, 11 June 2025 (UTC)[reply]
Yes, quite often it is, when the phrase is common enough to get some useful stats. Typically then, or when there are no n-gram stats for a less common phrase, I look at book/news/scholar hits to get more feeling for usage in titles vs sentences and such; it's harder to present summary stats on those. Dicklyon (talk) 22:47, 11 June 2025 (UTC)[reply]
You are right that ngrams should never be used uncritically, and in the discussions I've been involved in they've just been part of an argument. They are a flawed but useful way of determining what is usually capitalized. I'd not set any hard and fast rules, but the kind of points you raise would add to the information needed to make a decision. SchreiberBike | ⌨  18:37, 11 June 2025 (UTC)[reply]
I do not see how a ngram has any bearing on the proper name of an object. Sammy D III (talk) 19:33, 11 June 2025 (UTC)[reply]
I'll pretend you're serious, and try to explain. The n-grams summarize the frequencies of usage of phrases in books. Proper names show up with an overwhelming majority of capitalized uses, unless there's also a generic phrase using the same words, which does happen now and then. For example, if "Gettysburg Campaign" and "Gettysburg campaign" meant different things, one being a proper name and one not, you'd need to look elsewhere to discover that. So click through to some of the book links and see. If both forms appear to always refer to the same campaign, then you can conclude that the name of that campaign is not consistently capitalized in sources. Then, using WP's criterion for what to treat as a proper name, you'd have to conclude in the negative for that one. Of course, there are authors who say it is, and treat it as a proper name, capitalizing it. That's OK, just not what we do on WP. Dicklyon (talk) 22:55, 11 June 2025 (UTC)[reply]
I'd like to point out to outsiders that I don't talk to dick lyon (if he won't capitalize proper names then I won't capitalize his). This is an example of why not. I specifically said "of an object" and he ignored it, and instead insulted me. I know what an ngram is, that is the point of my post, but instead he changes to campaigns, which he knows I don't care about. I address him on one subject and he insults me while trying to change the question. Typical for him, if he is unable to address a direct question about a specific point he will attack someone else. Sammy D III (talk) 23:32, 11 June 2025 (UTC)[reply]
Could you refrain from personal attacks? Gawaon (talk) 07:18, 12 June 2025 (UTC)[reply]
  • if "Gettysburg Campaign" and "Gettysburg campaign" meant different things,
But they did, and you still insisted that 'gettysburg campaign' had to be used, which trashed the article and merged two related but distinct articles into one mess. I refer of course to the Royal Navy small boats of the mid-20th century. A Motor Torpedo Boat and a Motor Gun Boat are, of course, examples of the generic types of motor torpedo boat and a motor gun boat found in most navies at that time. But the point is that the RN named the specific, distinct and British classes within that generic type in the capitalised form. Much as the USN had a similar Patrol Boat, Riverine, capitalised. You'd never have the nerve to try for patrol boat, riverine because WP is dominated by US editors and you know how many Vietnam vets that would include.
Andy, I'm here to learn, too. Please show me the different meanings of "Gettysburg Campaign" and "Gettysburg campaign" that you have identified. Dicklyon (talk) 17:24, 16 June 2025 (UTC)[reply]
Then you went and did the same to Coastal Motor Boats, Steam Gun Boats and Harbour Defence Motor Launches, which are all RN neologisms which have only existed correctly in the capitalised form, because there is no corresponding generic type under any related name. Note that the Victorian steam gunboats (never capitalised) bear no relation to the WWII handful of SGBs, but that you threw all of them into your ngram melting pot regardless and pulled out the answer you wanted.
But you're not interested in accuracy. All that matters is MOS dogma and getting rid of capitalisation regardless (we can read your edit history for ourselves, there's little point in denying it). Andy Dingley (talk) 00:07, 12 June 2025 (UTC)[reply]
I don't think ngram is reliable enough for arguing for or against capitalisation, only a particular string. You may have seen my comment on Five freedoms. Rally Wonk (talk) 21:29, 11 June 2025 (UTC)[reply]
Sometimes the n-grams are pretty thin and hard to interpret. But did you look at the one I posted at that RM, with plenty of context? here it is again. Dicklyon (talk) 22:58, 11 June 2025 (UTC)[reply]
Are you able to add a trendline for "Five Freedoms of animal"? Rally Wonk (talk) 08:59, 12 June 2025 (UTC)[reply]
No, sadly, that one does not occur in enough books to show up. Dicklyon (talk) 21:47, 16 June 2025 (UTC)[reply]
They were not supportive of ngrams' reliability as a source. Andy Dingley (talk) 21:42, 11 June 2025 (UTC)[reply]
Folks are free to read the discussion for themselves, which I participated in. Your summary is accurate if we are being quite literal, but is potentially misleading in the context of this discussion. --MYCETEAE 🍄‍🟫—talk 23:25, 12 June 2025 (UTC)[reply]
  • I can see very little use for ngrams here, and certainly not as the irrefutable proof that they're almost always presented as, in every debate. Google's N-grams are out of context and English is a language that doesn't capitalise nouns except exceptionally; so every irrelevant phrase, or every bit of careless copywriting, that Google captures is always going to dilute any evidence to support capitalisation. We should not use, or at least should not place so much reliance on, a measurement technique with such an inherent bias.
This is also related to the phrasing. When we require sourcing to be "consistently capitalised", that is an impossible burden in a context that is inherently inconsistent and somewhat unreliable. We should be looking for a standard instead that is "authoritatively" or "robustly" capitalised: i.e. if we find reliable, robust, authoritative or the original neologising source, then we should place much more weight on that than we do on the sheer bulk of whatever Google's web crawler dug through. Andy Dingley (talk) 21:56, 11 June 2025 (UTC)[reply]
We should not use n-grams - except when it agrees with the way I think something should be capitalized! Then it is definitive! Blueboar (talk) 22:09, 11 June 2025 (UTC)[reply]
The problem with Google Ngrams is that you cannot see the context, so you don't know why a word was capitalised or lowercased. The names of campaigns are proper nouns, and are therefore always capitalised in English. But I agree with Andy Dingley: "consistently capitalised" is an absurd standard, and privileges advertising copy and misspellings. We should just adopt a standard and stick to it. Hawkeye7 (discuss) 22:18, 11 June 2025 (UTC)[reply]
For contexts that are common enough, you can use n-grams up to 5-grams. You can find extensions using a "*" at front or end to see what contexts are common enough to be in the counts. This doesn't always get you what you want, but often it does. See example in below subsection. Dicklyon (talk) 22:40, 11 June 2025 (UTC)[reply]
NGrams also bias (obviously) to content that's online, and the internet is notoriously illiterate when it comes to proper format and capitalization. And Ngram proponents will typically use them to push RS out of the way, along with actual specialist literature. They don't really "prove" anything aside from how often a word or phrase might occur online. Not that it matters... Intothatdarkness 15:10, 12 June 2025 (UTC)[reply]
Unless I'm mistaken, Google Ngrams is based on Google Books, not on online content (websites). Gawaon (talk) 16:34, 12 June 2025 (UTC)[reply]
It would also be quite amazing if there had already been that much online content in the 1850s or so. Gawaon (talk) 16:35, 12 June 2025 (UTC)[reply]
Right, books are much more representative of reliable sources than general content is. Unfortunately, in recent years you've got an awful lot of books that are cobbled from Wikipedia and/or influenced by Wikipedia, so the stats start to get wonky. New books are ingested automatically without scanning in most cases, which makes it way too easy to get a junky bias. Stats before about 2008 are more like reliable sources. Dicklyon (talk) 06:04, 14 June 2025 (UTC)[reply]
I've seen both used. And they're still random scattershots. However, some will still swear by them. Intothatdarkness 17:03, 12 June 2025 (UTC)[reply]
My 2¢: I think there is a slight over-reliance on Ngrams and other "Google tests" because they are so readily available. But I think most of the discussion is rather nuanced, incorporating multiple Ngrams and other source analysis. Cases that are swayed by a single Ngram usually have low participation, mainly by editors less well-versed in their assessment, and/or have dramatic, uncontroversial results. I think it is valuable for all of us to learn more about when and how to use Ngram and think critically about the results in context. See Dicklyon's tips below, for starters. --MYCETEAE 🍄‍🟫—talk 00:13, 13 June 2025 (UTC)[reply]
  • The most important n-grams are those that are recent and show a trend. No reason to enact a timeframe (75 years? Last five years is more like it). Common name is the name that is most common now, not decades ago. Randy Kryn (talk) 14:06, 16 June 2025 (UTC)[reply]

Tips for using n-grams

Typically, popular topics will appear in lots of titles, headings, list entries, and such, where use of title case is common, which biases the capped numbers upwards, sometimes a lot. To get a better idea of usage in sentences (which is what WP:NCCAPS and MOS:CAPS talk about), you can try different contexts that are less likely to appear in titles and lists. For example, put lowercase "the" in front. E.g. limiting to 70 years for the two campaigns above:

shows that usage has been about 50-50 through the turn of the century, with more lowercase earlier, and with a small uptick in caps very recently (likely influenced by WP's overcapping). There are other ways to explore contexts. Dicklyon (talk) 22:20, 11 June 2025 (UTC)[reply]

Here you can see lots of contexts by using a "*" at the end:

Here one can see that the most common context for the capitalized uses of Vicksburg Campaign are in the context of "the Vicksburg Campaign Trail", which I presume is a proper name, or part of one. Clicking through to books, you see that "Vicksburg Campaign Trail" is indeed a proper name (that is, it's consistently capitalized in sources). You can add words before or after, start or end with a "*", and things like that to get a better idea what's going on. If you don't do any of this, and just show the simplest n-grams like this section started with, it's not going to tell much of the story. Dicklyon (talk) 22:26, 11 June 2025 (UTC)[reply]

Note also that if you take a phrase that's well known to be accepted as a proper name, the n-grams look like this:

That is, "Boston Tea Party" is confirmed to be a proper name, because it's consistently capitalized in sources (with only a few percent exceptions, especially in the last half century). Dicklyon (talk) 22:35, 11 June 2025 (UTC)[reply]

Normandy Campaign *, Normandy campaign * Hawkeye7 (discuss) 00:47, 12 June 2025 (UTC)[reply]
Yes, good one. It shows plenty of caps recently for "Campaign" in the context of "Normandy Campaign 1944", and not so much otherwise. If you click through to books you can get an idea where many of those hits come from (looks like mostly due to book titles from 1992 and 2006 and 2012 and such, and citations to books and reports). Looking a little further, see Normandy Campaign 1944 * or * Normandy Campaign 1944. And further, and books. Dicklyon (talk) 02:53, 12 June 2025 (UTC)[reply]
Nice! Thank you. SchreiberBike | ⌨  13:28, 12 June 2025 (UTC)[reply]

Consistent capitalisation

Having thought about it some more, I believe that the word "consistently" is being misinterpreted. Some editors think that "consistent" means "in the majority of sources", but that isn't true. The dictionary definition [10] [11] says nothing about frequency. Consistently means what it has always meant -- continuous use.

For example, the ngram for "Abbasid Revolution" shows consistent use. We can speculate on why it is lowercased in some (most) cases, but that's irrelevant. The existence of lowercase spelling does not negate that the term is consistently capitalised. TurboSuperA+(connect) 07:25, 12 June 2025 (UTC)[reply]

It's more than "consistently"; MOS:CAPS says "... consistently capitalized in a substantial majority ..." —Bagumba (talk) 09:55, 12 June 2025 (UTC)[reply]
What about this part, right at the top of the page: capitalization is primarily needed for proper names?
This is what's confusing me: are we using "consistently capitalised in a substantial majority..." to determine what is and isn't a proper noun, or are proper nouns exempt from the requirement and the requirement is for capitalisation of terms other than proper nouns? TurboSuperA+(connect) 10:20, 12 June 2025 (UTC)[reply]
Presumably the former. If something is consistently capitalized in most sources, that's a strong indication that it's indeed a proper noun/name. Gawaon (talk) 10:25, 12 June 2025 (UTC)[reply]
But then we're ignoring the accepted definition of a proper noun: A proper noun is a noun that identifies a single entity and is used to refer to that entity. TurboSuperA+(connect) 10:36, 12 June 2025 (UTC)[reply]
If one reads proper noun more fully, it also says: A proper name may appear to have a descriptive meaning, even though it does not. Abbasid revolution is inherently descriptive - a revolution that occurred in Abbasid (the Abbasid Caliphate). Cinderella157 (talk) 09:53, 23 June 2025 (UTC)[reply]
It's not as simple as that: "my sister's bike" is a noun or rather noun phrase that identifies a single entity and is used to refer to that entity (assuming I have just one sister and she has just one bike), but that doesn't make it a proper noun and it would be odd to capitalize it. Gawaon (talk) 11:03, 12 June 2025 (UTC)[reply]
No, that's not how it works. "Sister's" is a possessive noun that shows ownership, while "bike" is a common noun. The only time you capitalise a possessive noun is if the noun is otherwise a proper noun, e.g. "my sister Sarah's bike". TurboSuperA+(connect) 11:11, 12 June 2025 (UTC)[reply]
But if so, wouldn't that mean that the Statue of Liberty had to be lower-cased and the University of Oxford would require a lower-case u, since statue, liberty and university are all common nouns? Gawaon (talk) 13:22, 12 June 2025 (UTC)[reply]
No, because "Statue of Liberty" is the name of the monument in New York, and therefore a proper noun. TurboSuperA+(connect) 13:34, 12 June 2025 (UTC)[reply]
Now we're back where we started. Why is the Statue of Liberty a name = proper noun, while the Bike of My Sister is not? Both refer to single, specific entities. Gawaon (talk) 14:02, 12 June 2025 (UTC)[reply]
Because your sister might have three bikes, or seven, while there's only one Statue of Liberty. You assume your sister only has one bike (if you want to push the point), while we don't have to assume that there is one Statue of Liberty in New York, we know what we are talking about when we say Statue of Liberty. Furthermore, "your sister's bike" is a description, "owned by your sister" is a quality of a bike, not its name. TurboSuperA+(connect) 14:08, 12 June 2025 (UTC)[reply]
But there is only one Sewer cover in front of Greg L’s house; I recall there was a discussion about treating it as a proper name. Dicklyon (talk) 16:39, 14 June 2025 (UTC)[reply]
We have a common noun "sewer cover", and there are tens of thousands of sewer covers like that. "Being in front of Greg L's house" is a temporary quality that particular sewer cover has. That sewer cover can be picked up and placed in front of Dick L's house, and then that quality is changed to "being in front of Dick L's house" and we can refer to it as a sewer cover in front of Greg L's house. A name is an immutable quality. Going back to the Statue of Liberty example, if that Statue of Liberty was moved to Toronto, it would still be the Statue of Liberty.
Another example is the Berlin Wall. People have pieces of the Berlin Wall at home in the United States. Those pieces were taken from Berlin, moved across an ocean and put into someone's house, yet the name Berlin Wall has not changed, it retained the name so that little piece can be referred to as a piece of the Berlin Wall. "Being in front of Greg L's house" is not special, there can be many things with that description/quality, concrete, asphalt, a car, blades of grass, a fence, a postman, a dog on a walk, a stray cat, a bird... these can all be referred to and described as "being in front of Greg L's house". But there is only one Berlin Wall and there cannot be any more. You cannot take a random brick and say "This is the Berlin Wall" or "This is a piece of the Berlin Wall", but if you put that brick in front of Greg L's house it will become a "brick in front of Greg L's house" (and can be described that way as long as it remains in front of Greg L's house). If you move it in front of your house, it will become a "brick in front of Dick L's house".
So I don't find those kinds of examples convincing, because while it may seem like a "gotcha", it doesn't hold water and doesn't survive even the slightest scrutiny. TurboSuperA+(connect) 16:58, 14 June 2025 (UTC)[reply]
Of course it's not convincing, it's just a silly example of how trying to decide proper name status by logic doesn't work. I know "Liberty" can be the proper name of an imaginary person, and a statue of Liberty has been made more than once, and that's a descriptive phrase. But the one in New York has earned the proper name "Statue of Liberty", not by any logic but because that's what everyone calls it (at first it was only about 60% capped, as many were treating it as descriptive, but that changed). There are other "statue of X" topics that have not earned proper name status, even if they are unique. That's why we look to sources to see how a name is treated. Dicklyon (talk) 18:07, 14 June 2025 (UTC)[reply]
But which sources? Google Ngram looks at all books available online, many of them self-published, many with questionable editorial oversight. It's hard to tell where the result comes from. I'm partial to Sammy D's suggestion below that we go by the sources used/cited in the article, because presumably they are all reputable sources. TurboSuperA+(connect) 18:25, 14 June 2025 (UTC)[reply]
I think this is a better way forward than relying on whatever an Ngram happens to spit out. Using article sources allows for more uniform verification and, as you point out, those sources are more likely to be RS and come from specialist literature. Intothatdarkness 12:30, 16 June 2025 (UTC)[reply]
There was a related thread Wikipedia talk:Manual of Style/Capital letters/Archive 42 § Proper name vs. "consistently capitalized". —Bagumba (talk) 11:23, 12 June 2025 (UTC)[reply]
Perhaps we should adopt a reputable organisation's style guide? For example, here's Oxford University's style guide (pp. 4-6 deal with capitalisation). Here is a pertinent example:
division: Capitalise only when used as part of the title of a division, not when referring to a division without using its full name.
example 1 (no cap): There are four academic divisions of the University: Humanities, Mathematical, Physical and Life Sciences, Medical Sciences and Social Sciences.
example 2 (cap): The Medical Sciences Division is based mainly in Headington. The division’s head is Alastair Buchan.
If we apply the same reasoning to MILHIST terms, let's say naming a campaign, then we'd get:
(no cap): The general led a successful military campaign at Gettysburg.
(cap): The Gettysburg Campaign was a successful military operation. The campaign's leaders were satisfied with the results.
The general rule from Oxford's style guide actually states: Do not use a capital letter unless it is absolutely required. Even when an organisation's style guide is conservative and encourages lowercase use when possible, it still says that words should be capitalised if they are part of the full name. TurboSuperA+(connect) 12:10, 12 June 2025 (UTC)[reply]
Interestingly, that general rule seems to be fairly similar to our own. Gawaon (talk) 13:26, 12 June 2025 (UTC)[reply]
It looks like we are using the general rule for proper names? Sammy D III (talk) 13:51, 12 June 2025 (UTC)[reply]
The problem is that "absolutely required" is still unclear. No-one is trying to scatter-gun capitalisation around like a Victorian jobbing printer. Andy Dingley (talk) 11:10, 13 June 2025 (UTC)[reply]
In practice, both their and our own rules boil down to "when in doubt, don't capitalize", which seems reasonable enough as a rule of thumb. Gawaon (talk) 11:27, 13 June 2025 (UTC)[reply]
I don't think so. I cannot find anything in that particular source where it says that the definition of a proper noun is unclear or in doubt. TurboSuperA+(connect) 11:32, 13 June 2025 (UTC)[reply]
The "absolutely required" means that names and other proper nouns should be capitalised. I cannot find a single instance in any source where they say that the definition of a proper noun is debatable (and not in a philosophical sense). If there is one, I would love to see it. TurboSuperA+(connect) 11:29, 13 June 2025 (UTC)[reply]
  • This is Wikipedia. Everything is debatable.
I've seen two main differences between mine and 'the Dicklyon standpoint'. Both of us might well be said to agree that 'names and other proper nouns should be capitalised'. The differences are then: what is a name or proper noun? and also what is the metric that allows us to judge this? Repeating the uncontentious statement doesn't advance the debate by much.
Mostly I've taken my own metric on this to be based around an idea of "authoritatively" or "robustly" capitalised: i.e. if we find reliable, robust, authoritative or the original neologising source, we follow their lead. In contrast, Dicklyon's reliance on Google Ngrams is a faith in 'the wisdom of crowds': The more sources, of any provenance, mixed into the result, the better. The trouble is though that that approach favours populism, bulk over quality, and general behaviour of English language will inevitably favour a lowercase answer. It's a bad metric and just shouldn't be treated as WP:RS.
Sammy mentioned military shovels back up the page. The 'Shovel, Entrenching, Squaddies for the use of' would be a classic format. But this isn't a proper noun, because the army didn't invent shovels. If though, they did invent a specific and novel device, such as MLRS, then I'm happy to accept that as a proper noun phrase and capitalise it. Even when we also have a distinct article at multiple launch rocket system on their generic form. Andy Dingley (talk) 12:41, 13 June 2025 (UTC)[reply]
I'd think that was partly the country. In the US I can show source for proper names being descriptive. They will be used throughout the manual as descriptive except when being identified by proper name, which is probably once or twice. The changing in the text is no problem for me (I used to do US Army trucks), they are descriptive, but I'd like the proper name in the title. These people have run through "my" stuff like locusts, it's only personal taste. It annoys me by who does it, but they seem to leave enough title, i.e. "M" numbers, to be just fine. Hell, everything I've ever done could use a good going over by an 8th grader. Just not a fanatical one.
(edit add) Reference shovels, Rolls Royce didn't invent either silver or ghosts. They took two generic words and used them as a proper name. Popularity was not an issue when they first named it (or whatever their first car was) and nobody else recognized even the manufacturer, much less the model. When "Silver Ghost" is upper-cased isn't it a name?
However, I am generally talking about objects. Battles and such look pretty Commonname from my POV. Sammy D III (talk) 13:26, 13 June 2025 (UTC)[reply]
Indeed, most battles throughout history never got a proper name, but some did. Only a tiny minority was considered important enough and remembered widely enough to get one. So how do we know which ones got one (say the Battle of Waterloo), and which ones got just generic references (say the siege of Jerusalem (70 CE))? Well, we look at how the reliable sources see it, and follow their lead. It's a pragmatic solution, but since there's no clear way of doing this in a more principled manner, it's still the best solution we have found so far, and might ever find. Gawaon (talk) 19:07, 13 June 2025 (UTC)[reply]
  • Sorry, but I've lost track of what you're advocating here. "we look at how the reliable sources see it, and follow their lead." So how do we pick out these reliable sources? Do we select them first (an editorial role) or do we throw everything into Google and let Ngrams count the simple majority? Andy Dingley (talk) 22:00, 13 June 2025 (UTC)[reply]
Should it be sources used? Sammy D III (talk) 23:52, 13 June 2025 (UTC)[reply]
There's often a difference between the specialist sources that are most reliable for facts, and the general sources that talk about a topic. That's discussed in the essay WP:SSF. There's no very clear way to survey all the reliable sources and quantify their usage preferences, and the book n-grams are one useful tool, sometimes giving a very clear result and sometimes not. Dicklyon (talk) 16:24, 14 June 2025 (UTC)[reply]
I don't have a strong opinion on this, and I'm not "advocating" anything. All I'm saying is that when it comes to battles and other events, usage decides what becomes sufficiently well established to be considered a proper name and what not, so there's no hard-and-fast way we could make that decision by ourselves without looking at the texts dealing with any given event. (Or, if there is one, I sure haven't yet heard of it.) Gawaon (talk) 07:33, 14 June 2025 (UTC)[reply]

Andy, you've said that we agree that 'names and other proper nouns should be capitalised'. But no, because you left out "proper" in front of "names". Many things have names that are not proper names, like that M40 gun motor carriage that we didn't yet reach consensus on. With "gun motor carriage" having been used generically before the US Army used it, and with the Army's own manual for this thing using the lowercase form in sentences, it would seem clear that it's not a proper name, but you insist it is. Therein lies the difficulty. Dicklyon (talk) 18:14, 14 June 2025 (UTC)[reply]

  • If something is consistently done, without [significant] variance. This inherently relates to the frequency with which something is done and the proportion (frequency) with which it is done the same way. Cinderella157 (talk) 00:54, 23 June 2025 (UTC)[reply]

What alternatives do we have for assessing the degree of capitalization in sources, or for countering claims of "it's a proper name" when the source stats show clearly that it's not usually capitalized? For titles, we at least have RM discussions. But for non-titles, like we have here, how do we go about settling such an argument if we can't even bring in the best and largest set of evidence that exists? Dicklyon (talk) 04:29, 23 June 2025 (UTC)[reply]

American English vs. British English

[edit]

It looks like there are differences in capitalisation between the two major versions of English. Some WP articles explicitly call for use of one or the other via the {{Use American English}} and {{Use British English}} templates. There are differences between the two when looking at capitalisation:

In general, it seems that recent American English sources show a preference for capitalisation. If an article is to be written in one or the other, and has a template to that effect, should we also only look at the corresponding Google Ngram corpus? TurboSuperA+(connect) 16:52, 15 June 2025 (UTC)[reply]

No, let's not overcomplicate things. Deciding capitalization issues is already complicated enough; filtering texts by their origin would just add a further complication without any real benefit. Also, for most articles (except those where TIES help to resolve the issue) it's essentially random which English variant is used, making it an unsuitable basis for further decisions such as regarding capitalization. Gawaon (talk) 17:59, 15 June 2025 (UTC)[reply]
My impression is that a lot of recent junk books based on Wikipedia get labeled as American English, even if they're not, so that's one more way that n-grams can get biased. I agree with Gawaon that this is not likely to ever be a meaningful or important signal to us about the provisions of MOS:CAPS. And generally, you should discount stats from after Wikipedia adopted a particular capitalization, as WP is unreasonably effective at influencing sources; it becomes circular. Dicklyon (talk) 02:41, 16 June 2025 (UTC)[reply]
"so that's one more way that n-grams can get biased."
And yet Google Ngram is the first and last argument in any move discussion, for example here. If it's so problematic, why is it sufficient? TurboSuperA+(connect) 12:37, 16 June 2025 (UTC)[reply]
It was the first there because it was easy and made a very clear case; it's not the last, as you can see. Dicklyon (talk) 00:07, 18 June 2025 (UTC)[reply]
The Google Ngrams options to constrain searches to US vs. UK English are not actually reliable. It turns out that the corpora used to generate them were A) small, B) not actually constrained by author/publisher location in the first place, but by where the books were obtained, and C) do not add up together to form the general English corpus. That is, the stats shown in the huge general English corpus for any given result are unlikely to be mirrored in the results of the much smaller US and UK corpora being added together. Even if this were not the case, this really wouldn't have any ENGVAR implications anyway; ENGVAR is concerned with differences between UK, US, and other varieties of English that are actually documented, provable, and pervasive (like colour vs. color). Iffy statistical suggestions of a slight lean this direction or that on some question in one dialect versus another would be insufficient to surmount MOS:COMMONALITY.  — SMcCandlish ¢ 😼  20:00, 16 June 2025 (UTC)[reply]
  • This might only be an issue in a case of strong national ties and could be argued for Gettysburg campaign but not Abbasid revolution. Because this is a statistical question with often large variations from year to year (i.e. signal to noise) there is an inherent issue with using a very limited number of data points (i.e. the final year result) as representative (cf. this ngram which effectively smooths out the random fluctuations and would indicate we should use lc). In general, the practice is to use the combined corpus (e.g. here, which isn't significantly different). Whether we should use a more specific corpus is a matter for debate in a specific RM. While the national corpuses available are for English or American, other varieties of English fall generally to one or the other. The limited selection is not a general issue but it may be an issue for a particular RM debate. Overall, KISS applies. Cinderella157 (talk) 01:18, 23 June 2025 (UTC)[reply]

RfC on the use of Google Ngram

[edit]

RFCBEFORE: Wikipedia talk:Manual of Style/Capital letters#It is time we talked about Google Ngram
Discussion at RSN: Wikipedia:Reliable sources/Noticeboard#Google N-grams and 'consistent' answers

Should Google Ngram be deprecated in rename/move discussions?

  • Yes
  • No

@Cinderella157, Dicklyon, Sammy D III, Myceteae, Gawaon, Andy Dingley, Intothatdarkness, SchreiberBike, Hawkeye7, Blueboar, Rally Wonk, Stepwise Continuous Dysfunction, FactOrOpinion, NatGertler, Yesterday, all my dreams..., Randy Kryn, Chicdat, AjaxSmack, SMcCandlish, and Kowal2701: Pinging participants in the MOS:CAPS discussion, the RSN discussion, and those who might be interested in this RfC. I also left an rfc notice at Village Pump (policy), WikiProject English Language, WP:NCCAPS. If I forgot someone, I am terribly sorry. TurboSuperA+(connect) 13:58, 16 June 2025 (UTC)[reply]

No, both because deprecating something from discussion is not a coherent suggestion, and because “useful” does not mean “perfect”. If you want to put together an essay to be used in responding to people trying to use it as a definitive statement, go ahead. Nat Gertler (talk) 10:11, 17 June 2025 (UTC)[reply]

Clarification. This question and RfC applies specifically to MOS:CAPS move/title discussions where the move is done to lowercase or uppercase letters in the name. Google Ngram should never be used and should be ignored in determining consensus. 19:12, 16 June 2025 (UTC)

  • Yes. There are simply far too many problems and too much uncertainty in the results. Here Dicklyon shows how self-published books skew Google Ngram results. Here I show that results from the British English and American English corpuses are different, and Gawaon says that there is no reliable way to tell which type of English it is. Here Hawkeye7 points out that there is no way to see the context in which a word/term is used from Google Ngram results. Deprecating Google Ngram would also prevent low-effort move requests, and editors would actually have to examine reliable sources. I believe this would cut down on the volume of requests and pave the way for actual discussion, rather than throwing up a Google Ngram link and thinking that that is all that is required. Ultimately, using Google Ngram is more trouble than it's worth. As to alternatives, we can always examine how the term is capitalised in the sources cited in the article; there's also Google Scholar and good old-fashioned discussion. TurboSuperA+(connect) 14:02, 16 June 2025 (UTC)[reply]
  • N-grams have value although not the final end-all of discussions. "Official names" should have much more influence, as they are usually the common names that the public uses and attributes as proper names. The ban on counting official names as "official" has always been a head-scratcher to me. Randy Kryn (talk) 14:12, 16 June 2025 (UTC)[reply]
    Randy, you mistake the policy. We don’t ban official names… we simply favor whatever is the more commonly used name. As you note, that often is the “official” name… but not always. Blueboar (talk) 14:50, 16 June 2025 (UTC)[reply]
  • No - Ngrams are a useful data point in move discussions. They should not be the only data point, but they should be examined and “in the mix”. Blueboar (talk) 14:50, 16 June 2025 (UTC)[reply]
  • Clarification requested:
  1. I take it this means we would extend the definition from deprecated sources (which applies to citing sources in articles) to mean that Ngrams should almost never be used in RMs and related discussions. Further, that in assessing consensus, admins should ignore or give substantially less weight to Ngrams. Is this correct?
  2. Is the intended scope limited to MOS:CAPS-related moves? I would note that Ngram data features in other RM discussions. Ngram is one of the tools mentioned at WP:DPT (part of the Wikipedia:Disambiguation editing guideline) as a tool that may be helpful.
--MYCETEAE 🍄‍🟫—talk 14:42, 16 June 2025 (UTC)[reply]
See responses and follow-up down the page here: [12] --MYCETEAE 🍄‍🟫—talk 18:43, 16 June 2025 (UTC)[reply]
Answered here and subsequently updated in the main RFC question. --MYCETEAE 🍄‍🟫—talk 01:58, 17 June 2025 (UTC)[reply]
  • No. I'm not sure why there's a mention of "rename / move discussions" as this is not WT:MOVE or WT:TALK but will guess it's because MOS:CAPS has been used in such discussions. I'm not sure why there's a mention of "deprecated" but will guess this is a use of the word in its typical English sense rather than the WP:RSP sense. I have no problem with TurboSuperA+ deprecating my use of any source in a discussion, but it should be up to me to decide whether I want to do it for a particular word in a discussion of that word (i.e. in context), and up to the other participants in that discussion whether they want to disagree. Peter Gulutzan (talk) 14:47, 16 June 2025 (UTC)[reply]
Update: I guessed wrong. TurboSuperA+ has changed the RfC. So the word "deprecate" is not being used in the typical English sense, and the RfC affects only WT:MOSCAPS. Now I'm guessing that the sentence "Google Ngram should never be used and should be ignored in determining consensus." only (due to the previous sentence) means "Google Ngram should never be used and should be ignored in determining consensus -- if and only if Google Ngram Viewer results are used in WT:MOSCAPS in move/title discussions where the move is done to lowercase or uppercase letters in the name (i.e. article title)." If so, ignore my "No". Peter Gulutzan (talk) 19:58, 16 June 2025 (UTC)[reply]
  • No per Blueboar. They are a data point that should be considered, but shouldn't be treated as the end-all and be-all. I would also note that this RFC would seemingly apply to all uses of ngrams, not just those involving capitalization. ~~ Jessintime (talk) 14:54, 16 June 2025 (UTC)[reply]
    "They are a data point that should be considered, but shouldn't be treated as the end-all and be-all."
    And yet. Not to mention it is easy to manipulate it. e.g.
    - This nomination is literally just a Google Ngram link. In that discussion it was suggested that recent results should be discounted because of a self-published book.
    - Here is another example. Notice how Dicklyon said "in the last half-century", ignoring the results from before because they don't suit the goal.
    Why should something so easily manipulated to give a result one wants be used as the only argument to rename an article? It really doesn't make sense to me. TurboSuperA+(connect) 18:19, 16 June 2025 (UTC)[reply]
  • Yes I don't think they should necessarily be removed from consideration, but they should NOT be considered the "be all and end all" in discussions, which seems to be the current standard. In my view RS, especially those used in the article in question, should always take precedence. In my experience this often does not happen, and the mighty Ngram is presented as gospel. Intothatdarkness 14:57, 16 June 2025 (UTC)[reply]
  • No. One shouldn't rely on it exclusively, of course, but it can be useful as part of a larger picture to take into account. Gawaon (talk) 15:09, 16 June 2025 (UTC)[reply]
  • Comment it's clearly not RS and I don't think the RSN thread was ambiguous or needed closure. As has already been pointed out, it is an arbitrary corpus interpreted by unreliable OCR that may or may not reflect actual usage trends and is virtually guaranteed to create ghost trends if enough comparisons are generated. Hence, it is reliable only for its own content which will almost always be UNDUE unless mentioned by an actual RS.
    But there's never been a hard-and-fast proscription against any discussion of GUNREL on talk pages. Situationally they can still be useful, it's not common, and the scope for use tends to be narrow, but they do have their purposes.
    From an RM perspective it should be treated as any other GUNREL would be. I don't see the need for a blanket proscription on bringing it up in discussions. 184.152.65.118 (talk) 15:12, 16 June 2025 (UTC)[reply]
  • No. Although I'm awaiting a response to my questions above, I don't foresee a change in my position. Ngram is a useful, if crude, tool for assessing many usage questions relevant to move discussions. To the extent it's overused, misleading, or inappropriate in a particular context, such objections should be raised in RM discussions or elevated to a discussion about updating the MOS/naming conventions for a particular subject area. The guidance at Wikipedia:Search engine test and two BEFORE discussions linked above provide useful considerations for using Ngram. --MYCETEAE 🍄‍🟫—talk 15:53, 16 June 2025 (UTC)[reply]
    The problem is some of those discussions are bludgeoned to death by people who rely on Ngrams to the exclusion of all else. They may have their purpose, but in many instances they've exceeded that purpose and become definitive. Intothatdarkness 15:57, 16 June 2025 (UTC)[reply]
    Example? Dicklyon (talk) 16:05, 16 June 2025 (UTC)[reply]
    When folks are WP:BLUDGEONING they should be warned and subject to disciplinary processes. If the problem is editor conduct, giving Ngram the scarlet letter is an inappropriate remedy. --MYCETEAE 🍄‍🟫—talk 17:31, 16 June 2025 (UTC)[reply]
  • Neither – it would be censorship to try to stop people from deprecating n-grams in rename/move discussions, and we should not be telling editors whether they should do so. If you can rephrase the RFC question to ask what you actually intended, I'll be happy to give a more in-depth answer. For now, look to the section you linked for info that refutes things like "there is no way to see the context how a word/term is used from Google Ngram results". Context is one of the most important things the n-gram stats can get you information about. Dicklyon (talk) 16:16, 16 June 2025 (UTC)[reply]
  • Comment (Summoned by bot) Please have mercy on people who are unfamiliar with Google Ngram and provide a precis of just what it is. Some of us are summoned by the bot and would like to make an intelligent comment, which we can't do if we don't have sufficient information on the subject of the discussion. Thanks in advance. Coretheapple (talk) 16:24, 16 June 2025 (UTC)[reply]
    The relevant background is in the section #It is time we talked about Google Ngram, a bit higher on this talk page. This RFC is a premature fork of that conversation, and ought to be closed pending some discussion of what a sensible question might be, if any. Dicklyon (talk) 17:20, 16 June 2025 (UTC)[reply]
    Google Books Ngram Viewer provides a general overview of Ngram beyond the current context. In RM discussions, it is often invoked as evidence in resolving various style and usage questions (or attempting to). Whether or not a word or phrase is usually capitalized in books, and whether the threshold aligns with the wording of MOS:CAPS and WP:NCCAPS is a not uncommon point of discussion and can be contentious. It is also raised in WP:COMMONNAME and other non-capitalization discussions, such as which of two or more synonyms is most often used to name a subject and whether particular usage is sufficiently common to be appropriate for natural disambiguation or a descriptive title. It is mentioned in non-capitalization contexts at Wikipedia:Search engine test and WP:DPT to give a sense of some other ways it may be used on WP. --MYCETEAE 🍄‍🟫—talk 17:49, 16 June 2025 (UTC)[reply]
    @Coretheapple, Google N-grams is a specialized search engine. It searches book contents rather than webpages. If you enter one or more words/phrases, it searches the contents of a very large set of books that have been printed over a very large range of time (it includes several centuries of books, and if I'm remembering right, the corpus is ~7 million books), and the output of each search is a graph showing how the relative frequency of the chosen words/phrases changes over time. It's case sensitive, and you can set some variables, such as what time range you want to search. Here's their example, and you might want to quickly try out a couple of your own choices. If you want more info, click on "About Ngram Viewer" at the bottom of that page. FactOrOpinion (talk) 17:55, 16 June 2025 (UTC)[reply]
    Can you show me please the results of these three precise strings? I'm struggling. (I've lost track of all these conversations but there is a move discussion at Five Freedoms).
    • Five Freedoms of Animal
    • Five Freedoms of animal
    • Five freedoms of animal
    I also say NO to deprecating ngram in general. YES to deprecating it from discussions based on capitals. Ngram cannot say where in a book, article or newspaper headline the words were used; their authors may have had good reason to capitalise something that shouldn't be capitalised here, or vice versa. Rally Wonk (talk) 18:14, 16 June 2025 (UTC)[reply]
    "YES to deprecating it from discussions based on capitals."
    That was the intent behind the RFC. I should have made it more clear, but for some reason thought the context of the discussion (and the linked RFCBEFORE) would give that clue. TurboSuperA+(connect) 18:20, 16 June 2025 (UTC)[reply]
    Sure, here's the result from 1900 forward. It shows no results at all for the second and third of the three, only the first, and no (or next to no) results prior to 2000. FactOrOpinion (talk) 18:38, 16 June 2025 (UTC)[reply]
    Thank you. So no results for "Five Freedoms of animal", yet when I click 'Search in Google Books', the third result returns use of "Five Freedoms of animal" written mid-sentence in prose. This tool cannot be used for caps discussion. YES to the clarified proposal. Rally Wonk (talk) 18:49, 16 June 2025 (UTC)[reply]
    In the case-insensitive n-grams for Five Freedoms of Animal, you get about equal numbers of "Five Freedoms of Animal" and "five freedoms of animal". Other combinations of capitalization fall below the threshold number of books to be counted, so don't show up. It's hard to tell how much less frequent "Five Freedoms of animal" is, just that it's less. The tool can be used for what it shows; understand its limits when using it. Dicklyon (talk) 21:46, 16 June 2025 (UTC)[reply]
    I appreciate the responses. Perhaps this RfC could be rephrased to guide the ignorant. Coretheapple (talk) 22:26, 16 June 2025 (UTC)[reply]
  • Comment I'm not entirely clear on the question (e.g., what does it mean to "deprecate" in this context, where we're not talking about a source for article content?). Given that Google N-grams are sometimes used in discussions (whether for move discussions or something else), I think it would be helpful to have a statement somewhere in a guideline specifying that it is not a source, but is instead a search engine, and it searches a large corpus of books, but where we have zero way of knowing what percentage are books that we would deem to be RSs, and zero way of knowing how representative this corpus of books is among all books in English, nor whether the results of looking at word/phrase use in books are essentially the same as their use in other formats (e.g., newspapers). The search does not distinguish among capitalization in a chapter title, in the middle of a sentence, ... There are optical character recognition errors and date metadata errors. Because the graphs represent proportional occurrence, one can be misled about the frequency in American English vs. British English if the book corpus doesn't include the same # of words from each, and it's unclear how it treats English variations outside of the US and the UK (e.g., is the capitalization in Nigerian English, Indian English, etc., the same as British English?). Because the graphs represent proportional occurrence, the areas under two graphs aren't as meaningful as they might be. There are limitations in assessing the context of a given N-gram. Is it worthless in assessing information for WP's purposes? Probably not. But it certainly shouldn't be held out as some definitive result. FactOrOpinion (talk) 17:02, 16 June 2025 (UTC)[reply]
    It seems to me that indeed nobody knows what exactly "deprecate" is supposed to mean in this context. That surely is a problem with this RfC. Gawaon (talk) 17:38, 16 June 2025 (UTC)[reply]
    @TurboSuperA+, several of us would appreciate your clarifying what you mean by "deprecate" in this context. FactOrOpinion (talk) 17:57, 16 June 2025 (UTC)[reply]
    It means that it would no longer be allowed to use it to argue one way or another in moves/renames when the rename is done to lowercase or uppercase the letters of the title. TurboSuperA+(connect) 18:13, 16 June 2025 (UTC)[reply]
    One other limitation that I didn't include earlier: it allows one to select English fiction as the corpus, but it does not allow one to select English non-fiction as the corpus. I don't know that they provide info anywhere about the relative sizes of these two subsets of their overall English corpus. FactOrOpinion (talk) 20:01, 16 June 2025 (UTC)[reply]
@Myceteae To no.2, yes, I meant it for MOS:CAPS discussions. I thought that would be clear since we're on the MOS:CAPS talk page. To no.1, also yes. Ideally they would be ignored. Right now, many moves use Google Ngram results as the only argument for or against a rename. TurboSuperA+(connect) 18:23, 16 June 2025 (UTC)[reply]
@TurboSuperA+ it might be worth making a brief, clearly identified update to the question to clarify your intended meaning: That Ngrams should be rarely (or never) used and should be discounted (or ignored) in determining consensus, and that this applies specifically to MOS:CAPS move/title discussions. I was pretty sure this is what you meant but wanted to confirm for myself and mainly for the benefit of others who are less familiar with these debates. I see that versions of these questions have been raised by a few others but it's still early and would benefit the rest of the discussion to clarify this up top while making it clear that this was added up top later. --MYCETEAE 🍄‍🟫—talk 18:52, 16 June 2025 (UTC)[reply]
Done. Thank you for the advice. TurboSuperA+(connect) 19:13, 16 June 2025 (UTC)[reply]
  • Like some other commenters here, I'm not sure what "deprecate" means in this context. How would MOS:CAPS "express disapproval" of Google Ngram results? A note to the effect of "Google Ngram results are not accurate for matters of capitalization and should not be the sole basis for determining article titles"? Or a stricter prohibition?
I'm fine with editors mentioning or linking Google Ngram results; I do so myself sometimes with "peruse the results yourself". But I agree with some here that the results can be highly misleading. In addition to other problems already mentioned, Google Books sources can be highly imbalanced in certain instances. In one case, I saw a sudden spike in usage of a term in the 1950s and upon investigation found that that spike was entirely due to its use in UN documents. Google had a large number of these documents scanned, and they swamped usage of other print materials in the same era. —  AjaxSmack  18:52, 16 June 2025 (UTC)[reply]
  • Obviously no. WP is not going to ban the use of a tool that is frequently crucial, just because a handful of individuals do not know how to use it properly. This is an attempt to end-run around the first rule of MOS:CAPS (only words and phrases that are consistently capitalized in a substantial majority of independent, reliable sources are capitalized in Wikipedia). If proponents of unnecessary over-capitalization succeed in banning one of the primary means of establishing the capitalization rate in source material then the assessment for many topics would be difficult or impossible, so the RM results would probably come down to whoever screamed loudest (and we all already know that'll be the single-topic editors demanding unnecessary capitalization in their pet topic).  — SMcCandlish ¢ 😼  19:44, 16 June 2025 (UTC)[reply]
    reliable sources
    Google Ngram doesn't only check reliable sources, but self-published books as well. So it would seem that that particular sentence from MOS:CAPS precludes the use of Google Ngram by definition. TurboSuperA+(connect) 19:48, 16 June 2025 (UTC)[reply]
    In addition to not checking only reliable sources, it also doesn't check only independent sources. It doesn't filter out headlines, proper nouns, captions, indexes, etc. Thryduulf (talk) 20:02, 16 June 2025 (UTC)[reply]
    The SPS problem is avoided by constraining searches to 2019 and earlier. The expansion of the corpus after 2019 is when junk books were dumped into the data set. The latter problem, of distinguishing running-text usage from usage in title-case headlines, captions, etc., is avoided by using a series of carefully selected searches. Everyone familiar with these tools already understands this. They are not a be-all and end-all tool, and one is apt to get more useful results from a Google Scholar search, but that doesn't make the tool useless.  — SMcCandlish ¢ 😼  20:18, 16 June 2025 (UTC)[reply]
    Is there an essay anywhere about what "everyone familiar with these tools already understands"? FactOrOpinion (talk) 20:41, 16 June 2025 (UTC)[reply]
    The SPS problem is avoided by constraining searches to 2019 and earlier. The expansion of the corpus after 2019 is when junk books were dumped into the data set.
    So Google Ngram can never give us an accurate representation of capitalisation in contemporary sources. Another reason to stop using it. TurboSuperA+(connect) 21:12, 16 June 2025 (UTC)[reply]
    I agree I would not try to use it for "contemporary" usage, unless you intend to show how Wikipedia capitalization influences contemporary usage. It's much more meaningful to look at usage up to, for example, the time a Wikipedia article was created, before WP has had time to feed back to stats by influencing usage. Dicklyon (talk) 21:41, 16 June 2025 (UTC)[reply]
    This plus the polluting of the corpus after 2019 may decrease Ngram's utility over time. There's been some acknowledgement of the trend towards capitalization in some of the revolution RMs that were deemed to not yet meet our threshold. With capitalization and many other RMs where Ngram data is raised, it often appears usage is headed in a particular direction but is not yet ripe for a particular title, and the suggestion is raised that we revisit in a year. This doesn't change my view that Ngram is often useful, to be clear. --MYCETEAE 🍄‍🟫—talk 02:09, 17 June 2025 (UTC)[reply]
    I dunno, the anti-capitalist editors scream pretty loud, too. Give yourself some credit! --MYCETEAE 🍄‍🟫—talk 01:46, 17 June 2025 (UTC)[reply]
  • Yes for capitalisation, mostly for other uses. Ngrams are a single datapoint that is sometimes interesting, but the complete lack of any context to the usage and mix of reliable and unreliable sources means that it is essentially never actually useful in determining what capitalisation is appropriate. When the discussion is about which term is more commonly used and there is no evidence of a British/American English split and there is no evidence that usage differs in different contexts then it can be useful evidence but it is never conclusive on its own. Thryduulf (talk) 20:00, 16 June 2025 (UTC)[reply]
  • No. Corpus-based data is useful for COMMONNAME-based RMs, since in many circumstances they can survey usage more broadly than an editor searching by hand would be able to. On those grounds alone, I think anything as sweeping as this RfC proposal is likely to do more harm than good. That's not to say that Ngrams should be used uncritically—they have limitations, like any potential source of data—but the ideal solution here would be much closer to "information page about the strengths and weaknesses of Ngram data" rather than a full proscription. Even for caps-related uses of Ngrams, which are one of the areas where the corpus' weaknesses are most profound, they can still have some informational value; for example, if Ngrams gives a 50:1 ratio for one capitalization or another, it's all the likelier that the presence of headlines and titles isn't significantly skewing the numbers. ModernDayTrilobite (talk • contribs) 21:07, 16 June 2025 (UTC)[reply]
  • Yes per Thryduulf. Ngrams might possibly be helpful if they were used with greater caution and nuance, but I don't see a realistic path forward in which that happens, so I think deprecation is a preferable alternative to the status quo. LEPRICAVARK (talk) 23:31, 16 June 2025 (UTC)[reply]
  • Yes - go the deprecation route. GoodDay (talk) 23:42, 16 June 2025 (UTC)[reply]
  • No: Ngrams are flawed, but every simple way of answering a complex question is flawed. Discussions should include criticism of all methods including Ngrams. Capitalization is not an easy question. Every definition of proper noun fails when examined closely. Our consensus process is the best way we've found and it's not perfect, but it's working fine. SchreiberBike | ⌨  11:17, 17 June 2025 (UTC)[reply]
  • Yes as I said before, they have too much randomness. Yesterday, all my dreams... (talk) 16:38, 17 June 2025 (UTC)[reply]
    They do have some problems, but randomness is not among them. Dicklyon (talk) 00:04, 18 June 2025 (UTC)[reply]
    Do you know how they determined which books to include in their corpus? Was it a random sample, or a representative sample, a convenience sample, ...? FactOrOpinion (talk) 00:15, 18 June 2025 (UTC)[reply]
    Not exactly, but I know they started scanning all the books in about a half dozen major university libraries, and after that got direct feeds of already-digital books from publishers. Basically, anything they could get, even though in recent years so many books are just wiki-derived, self-published, or AI slop. Our article Google Books says, "As of October 2019, Google celebrated 15 years of Google Books and provided the number of scanned books as more than 40 million titles. Google estimated in 2010 that there were about 130 million distinct titles in the world ..." Since several of the libraries were in the US and UK (see Google Books#Initial partners), I presume they've covered a larger fraction of English-language books than other languages, and I presume that the libraries tend to bias the collection toward "reliable". But it's basically everything, not a sample. Dicklyon (talk) 02:53, 18 June 2025 (UTC)[reply]
    I've now hunted down a bit more info, and it's nowhere near "basically everything." The corpus for the Ngram viewer is a proper subset of the Google books corpus: "The first version of the data set, published in 2009, incorporates over 5 million books. These are, in turn, a subset selected for quality of optical character recognition and metadata—e.g., dates of publication—from 15 million digitized books, largely provided by university libraries. ... The second version, published in 2012, contains 8 million books." (source) I haven't been able to find the size of the 2019 corpus, but if the ratio is about the same as in the first corpus (1/3), then it's ~13.3M books, which is less than 9% of all books (based on this estimate of total current books). I'm guessing that the fraction of Google Books that came from university libraries has dropped over time. It's definitely not a random or representative sample. It's more of a large convenience sample.
    For WP's purposes, it doesn't even make sense to me to weight each book equally. FactOrOpinion (talk) 14:49, 18 June 2025 (UTC)[reply]
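    The back-of-the-envelope estimate above can be reproduced with a short sketch. The figures are the ones quoted in this thread (5 million of 15 million books in the 2009 data set, more than 40 million scanned titles as of October 2019, and Google's 2010 estimate of about 130 million distinct titles). Carrying the one-third ratio over to the 2019 corpus is an assumption taken from the comment, not a published figure, and the variable names are illustrative only.

```python
# Figures quoted in this thread. The 2019 ratio is an ASSUMPTION, not a published number.
NGRAM_2009 = 5_000_000           # books in the first Ngram data set
DIGITIZED_2009 = 15_000_000      # digitized books that data set was selected from
SCANNED_2019 = 40_000_000        # Google Books titles reported in October 2019
WORLD_TITLES_2010 = 130_000_000  # Google's 2010 estimate of distinct titles

ratio = NGRAM_2009 / DIGITIZED_2009            # about 1/3
est_ngram_2019 = SCANNED_2019 * ratio          # hypothetical 2019 corpus size
share_2010 = est_ngram_2019 / WORLD_TITLES_2010
min_total_for_9pct = est_ngram_2019 / 0.09     # total needed for the share to fall below 9%

print(f"{est_ngram_2019 / 1e6:.1f}M books")        # 13.3M books
print(f"{share_2010:.1%} of the 2010 estimate")    # 10.3% of the 2010 estimate
print(f"{min_total_for_9pct / 1e6:.1f}M titles")   # 148.1M titles
```

    Against the 2010 estimate of about 130 million titles the share comes to roughly 10%. A share below 9% requires a total above roughly 148 million titles, so the percentage depends on which estimate of "all books" is used.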
    Thank you for correcting my mis-impressions. Dicklyon (talk) 16:43, 18 June 2025 (UTC)[reply]
    It's interesting that the main problem that paper highlights is the over-representation of specialist sources, as opposed to being representative of general usage. Not surprising, and likely another reason why capitalization is exaggerated therein. Dicklyon (talk) 16:48, 18 June 2025 (UTC)[reply]
    Any time you deal with huge data sets over long periods of time, randomness comes in, one way or another. But based on your response below, I think you know that now. But no worries... Yesterday, all my dreams... (talk) 22:03, 18 June 2025 (UTC)[reply]
    My real point was just that the unknowns and biases are not random, we just don't know exactly what they are. But if you want to think of them as random, that's fine too. Dicklyon (talk) 22:48, 18 June 2025 (UTC)[reply]
  • No. It still has some value in determining the common name. We shouldn't only be using Ngrams, though. Mellk (talk) 20:55, 17 June 2025 (UTC)[reply]
  • No way. Maybe a set of best practices on how it is used might be reasonable, but to exclude evidence entirely because of what seems like a dispute between the proposer and a single user, Dicklyon, on how it should be used is excessive to an extreme. If how it's used is believed to be incorrect, explain what the errors are and present better evidence. Alpha3031 (t • c) 07:42, 19 June 2025 (UTC)[reply]
    explain what the errors are and present better evidence.
    WP:NGRAM (work in progress, links need to be made into permanent ones, etc.) TurboSuperA+(connect) 08:32, 19 June 2025 (UTC)[reply]
    You're clearly free to write whatever essay you want, but I think it would be much more useful to create an essay that addresses both the problems and what knowledgeable users of this tool have learned over time about how to improve the reliability of results (e.g., above, @SMcCandlish said "distinguishing running-text usage from usage in title-case headlines, captions, etc., is avoided by using a series of carefully selected searches. Everyone familiar with these tools already understands this," but didn't elaborate on what such a series of carefully selected searches looks like). I also suggest that you reread this entire thread to see what other relevant points have been made. FactOrOpinion (talk) 14:46, 19 June 2025 (UTC)[reply]
    It is just the beginning, note where I say "work in progress" meaning "not finished". It does include comments from this thread (and will continue to include them as they come in). I only linked it because the editor asked for evidence of problems/unreliability, and it was a convenient way to provide it. We can talk about the essay on my talk page or anywhere else other than this RfC, so we don't derail it. TurboSuperA+(connect) 16:05, 19 June 2025 (UTC)[reply]
    re: the editor asked for evidence of problems/unreliability, I meant you should point out the specific issue at each specific RM, or find a better indicator. Alpha3031 (tc) 08:48, 21 June 2025 (UTC)[reply]
  • No The question of what we should cap on WP is essentially a statistical question, since it would be impossible/unreasonable to identify all sources using a particular term in prose. Consequently we look to identifying a sample of sources to determine the proportion of usage. Note that the proportion of usage is the key issue - it is not a source war determined by which side can produce the most sources. The virtue of ngrams is that they draw on a large sample which is free from observer bias originating with the WP editors interrogating the sample. However, even ngrams may not have a sufficiently large number of uses for a particular term (search string) to reasonably address the question, and a bias toward technical/academic sources does exist. Where I have seen objections to the use of ngrams in discussions, this largely occurs where editors supporting capitalisation see the results as being at odds with their perceptions of what the results should be. Yes, there are some imperfections with ngrams as a tool as identified herein but these imperfections tend to favour capitalisation. Understanding the strengths and limitations of a tool is the issue here. I believe that this RfC was premature and that the above discussion (#It is time we talked about Google Ngram) still had some way to go in providing useful comments on how ngrams can be effectively used and when not. Interrogating any sample set of sources for a particular usage on a time basis (eg by year) will show evidence that varies each year - sometimes greatly. This is what we see in any ngram. Ngrams provide a smoothing function to help deal with this randomness. It is not a fault of ngrams that the data can have a significant deal of randomness - it is the nature of the data (the beast).
To the modified RfC question, there are reasonable instances where ngram evidence alone would be sufficient to initiate an RM, though it is good practice to confirm ngram results, at least against google book results. The utility of ngrams in capitalisation discussions (RMs) is a matter to be determined on a case by case basis. Cinderella157 (talk) 00:33, 23 June 2025 (UTC)[reply]
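    The smoothing mentioned above can be illustrated with a minimal sketch. In the Ngram Viewer, a smoothing of n averages each year's value with n values on either side. The function name `smooth`, the truncated window at the edges, and the sample figures below are assumptions made for illustration; this is not Google's code or real Ngram data.

```python
def smooth(values, n):
    """Centered moving average: each point is averaged with up to n neighbours per side."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - n)                # window is truncated at the edges (assumption)
        hi = min(len(values), i + n + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out


# Hypothetical yearly shares of the capitalized form (made-up numbers).
yearly = [0.40, 0.70, 0.30, 0.65, 0.35]
smoothed = smooth(yearly, 1)
print([round(v, 3) for v in smoothed])  # [0.55, 0.467, 0.55, 0.433, 0.5]
```

    The year-to-year swings in the raw series (0.30 to 0.70) shrink to a much narrower band after smoothing, which is why a smoothed curve can look steadier than the underlying counts.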
    "Note that the proportion of usage is the key issue": the issue is that Google ngrams do not provide a reliable indication of the proportion of usage because of all the deficiencies identified in this thread and elsewhere. Thryduulf (talk) 09:39, 23 June 2025 (UTC)[reply]
    Yes, it has deficiencies. However, we have nothing better, and with some work we can work around many of those deficiencies. Everything has flaws. If anyone has a better idea of how to evaluate a large corpus of published data for capitalization, I'm open to it, but let's not remove the best we have available.  SchreiberBike | ⌨  11:13, 23 June 2025 (UTC)[reply]
    It might be the best available, but that doesn't mean it is good enough. Its deficiencies mean that discussions regarding capitalisation are the area where its results are the least reliable at representing what we are attempting to measure (the prevalence of different capitalisations in the running prose of reliable sources using the term in the same context(s) as the article). In every other aspect of determining Wikipedia content we restrict ourselves to using sources that are both reliable and relevant, even if there are sources that are significantly easier to access that don't meet those requirements. Titles of articles should be no different. Thryduulf (talk) 11:25, 23 June 2025 (UTC)[reply]
    We have the sources cited in the article, which are all (or should be) RS. Meanwhile there is no guarantee Google Ngram is searching through reputable sources only. Not to mention the results can be gamed: "A single, prolific author is thereby able to noticeably insert new phrases into the Google Books lexicon, whether the author is widely read or not."[13] This is made worse by the fact that anyone can "publish" a book, buy an ISBN and have it added to Google Library/ngram results. TurboSuperA+(connect) 11:28, 23 June 2025 (UTC)[reply]
    "with some work we can work around many of those deficiencies" How? If it's going to be used, it would be helpful for you to list the problems for which you think there's a work around and what that work around is. FactOrOpinion (talk) 12:18, 23 June 2025 (UTC)[reply]
    Search on this page for "Tips for using n-grams". SchreiberBike | ⌨  00:04, 24 June 2025 (UTC)[reply]
    OK, but none of those address many of the issues that have been raised here, the most significant of which is that the corpus is not limited to RSs, and there is no way to know whether the distribution in RSs would be different than in the corpus as a whole, nor how representative the corpus is of all books (and of course, our RSs are sometimes texts like newspapers that aren't part of the corpus at all). FactOrOpinion (talk) 01:00, 24 June 2025 (UTC)[reply]
    This is an RfC about no longer allowing the use of Google Ngrams in capitalization decisions. Google Ngrams have problems but so does every alternative. The vast majority of the time there is general agreement about capitalization, but sometimes we need to look at the evidence and discuss. We should look at all the alternatives and come to a consensus. No single source of information is decisive, but we can look at and weigh sources based on their value. I feel frustrated that people are searching for some perfect solution and disregarding a process that has worked. SchreiberBike | ⌨  01:35, 24 June 2025 (UTC)[reply]
    I'm aware of what it's an RfC for, and I'm not looking for a perfect solution. I'm simply noting that MOS:CAPS clearly states "only words and phrases that are consistently capitalized in a substantial majority of independent, reliable sources are capitalized in Wikipedia." Google Ngrams does not distinguish RSs from non-RSs. There isn't even any way to assess what % of the corpus are RSs. FactOrOpinion (talk) 02:00, 24 June 2025 (UTC)[reply]
    We can indeed look at and weigh sources based on their value, however we cannot do that using google Ngrams because there is no way of knowing which sources are in its corpus and thus no way of knowing what their value is. Also, the perfect solution not existing is not a reason to use solutions that are bad. Thryduulf (talk) 02:07, 24 June 2025 (UTC)[reply]
    Exactly this. Intothatdarkness 12:53, 27 June 2025 (UTC)[reply]
Add: When doing a case-insensitive search on ngrams, the casing in the search term has no effect on the search result (compare here and here). The assertion that it does is incorrect.
The capitalisation of moon is a conundrum because different people have different views on where or when it should be capitalised. Ngrams do have the capacity to be tailored to capture/represent different contexts. If moon (the earth's moon) is widely known as a proper name through the English language, why do we see near equal capitalisation here, when, as a rule, proper names are always capped? We can compare the ngram with sources (eg here) which indicate a similar result for capitalising moon. Even looking at the sources used in the article on the Moon, the reference section shows that moon is not consistently capitalised.
Perhaps the issue is summarised by this quote, You really used an Ngram to prove that the "Moon" should be lower-cased because the majority of people are ignorant?[14] - ie I know better and the sources are wrong. That ngrams don't give the result one wants or expects, or give an answer one thinks is wrong, is not a good reason to oppose their use. Cinderella157 (talk) 11:57, 25 June 2025 (UTC)[reply]

Yes for objects. I think this attempt to lower-case the Moon can show the weakness of Ngram use.

That was a while ago but Dick Lyon just posted this about it (deliberately no diff): "...things like this data from sources (which is very unlike what we see in an astronomical context). I prefer to stick to what sources tell us..."

The first search, with Armstrong's proper name uppercase, will put the search in a specific context with one person doing one action. The second search, replacing the upper-case proper name with a lower-case "earth", puts the search into a much wider context with different results.

Since we are talking about the Moon, which is widely known as a proper name through the English language, why should only the narrow search be "the sources"? Because of these variations I suggest that Ngrams aren't reliable for proper names of objects. No position on actions and probably useful for Commonname.

Why are we talking about potential sources which can't possibly be checked, instead of sources actually used in the article, which can? Thank you. Sammy D III (talk) 18:44, 23 June 2025 (UTC)[reply]

Armstrong was just an example. You can see lots more: here, and decide which ones you'd like to focus on, or reduce the context like here or here. I believe I had brought that up relative to an edit I had made about landing on the moon. All the n-grams show is that most authors don't capitalize moon in that context. Dicklyon (talk) 03:16, 24 June 2025 (UTC)[reply]
I owe you a sincere apology. I got it wrong. I thought you had tried to move the Moon, not just Commonname it in the text. I have struck it out and I mean it. Not an olive branch, just the right thing to do. Sammy D III (talk) 12:50, 25 June 2025 (UTC)[reply]
Yes, I have now succeeded in unilaterally moving it to moon. I hope that's OK. It's not as if we didn't discuss it.[Joke] Dicklyon (talk) 22:27, 25 June 2025 (UTC)[reply]
Single-handedly moving the moon, now that's quite an accomplishment! Gawaon (talk) 15:22, 29 June 2025 (UTC)[reply]
With even more searches and results I think you have reinforced my point: "Because of these variations I suggest that Ngrams aren't reliable for proper names of objects". Sammy D III (talk) 03:26, 24 June 2025 (UTC)[reply]
I'm puzzled re what you're trying to say. Nobody is claiming that Ngram are "reliable for proper names of objects", whatever that means. They just show usage statistics. Dicklyon (talk) 04:01, 24 June 2025 (UTC)[reply]
The proper name of Earth’s only moon is Luna. She’s lovely, unlike Phobos and Deimos. SmokeyJoe (talk) 13:44, 17 July 2025 (UTC)[reply]

His father is Black and his mother is white.

At the end of Tyrese Haliburton#Early life and family, it says His father is Black and his mother is white. Is this mixed capitalization of races appropriate? If not, what is the consensus on how to treat them? FWIW, the cited source uses that exact style, but evidently this appears contrary to MOS:RACECAPS. Left guide (talk) 06:07, 21 June 2025 (UTC)[reply]

I think the accepted WP practice is to apply it consistently on a given page, whatever style is chosen. —Bagumba (talk) 06:23, 21 June 2025 (UTC)[reply]
Yes, a note in MOS:RACECAPS says "The status quo practice had been that either style was permissible, and this proposal did not overturn that". I too would interpret that as meaning that both capitalized style and lower-case style are permissible, as long as the chosen style is used consistently on any given page. Mixed usage is not accepted – the proposal to capitalize only "Black" failed to reach consensus. Gawaon (talk) 07:14, 21 June 2025 (UTC)[reply]
I don't understand that answer. Shouldn't it be No if either capitalized or lowercase is acceptable, and mixed like this is not? Dicklyon (talk) 22:29, 25 June 2025 (UTC)[reply]
I think Gawaon's "yes" is agreeing with Bagumba, not responding to the question in the second sentence of the original post. --Trovatore (talk) 05:03, 26 June 2025 (UTC)[reply]
Yes, I expressed my agreement with Bagumba. Gawaon (talk) 05:15, 26 June 2025 (UTC)[reply]
Yes, it is fine, and not contrary to MOS:RACECAPS. The upshot of the RfCs was that both upper or lower case is acceptable. Hawkeye7 (discuss) 23:51, 25 June 2025 (UTC)[reply]
Hmm, taken at face value that would also seem to allow His father is black and his mother is White., which I suspect would elicit objections. --Trovatore (talk) 00:53, 26 June 2025 (UTC)[reply]
I remain confused. Does "both upper or lower case is acceptable" mean it's OK to do them differently, like Trovatore illustrates? Or not? Or is Black and white OK as in some publications' modernized styles, but black and White not? I'm not saying it's an easy question, just that I don't understand these answers. Dicklyon (talk) 05:01, 26 June 2025 (UTC)[reply]
My interpretation of "either style [is] permissible" in MOS:RACECAPS is that consistency is still required – if something is not consistent, it's not a style, and hence not permissible. So it's not OK to lower-case "white" in one sentence and capitalize it in the next (when both apply to persons), since that's not consistent. Neither is it OK to capitalize "White" and lowercase "black" since that's not consistent. Nor the other way around. Consistency is an implicit requirement here (per our general rules), so both terms must be treated the same. Gawaon (talk) 05:18, 26 June 2025 (UTC)[reply]
Consistency is not required. Although we did not adopt the American practice of capitalising Black only, editors are free to do so. Hawkeye7 (discuss) 05:27, 26 June 2025 (UTC)[reply]
Are they free to capitalize White only? --Trovatore (talk) 05:31, 26 June 2025 (UTC)[reply]
If they want to. I guess you are wondering what the content creators will do with so much editorial freedom. Hawkeye7 (discuss) 08:15, 26 June 2025 (UTC)[reply]
Rereading the note in MOS:RACECAPS, I see that "mixed use" (i.e., capitalized "Black" and lower-case "white") is indeed allowed as well. I stand corrected! Gawaon (talk) 06:08, 26 June 2025 (UTC)[reply]
Should we add a line to the main text clarifying that mixed use is permissible when editors determine this is the appropriate style for a particular article? The added detail in the note is useful but the top line guidance is easily missed. --MYCETEAE 🍄‍🟫—talk 20:22, 28 June 2025 (UTC)[reply]
Why not? It sure would make things clearer. Gawaon (talk) 21:02, 28 June 2025 (UTC)[reply]
 Done Special:Diff/1299365122. I tried to stick very closely to the wording in the note to reflect that this is a mere clarification and not a change but additional wordsmithing may be in order. --MYCETEAE 🍄‍🟫—talk 00:26, 8 July 2025 (UTC)[reply]
That is not how I read the results of that discussion. I read it as either Black and White or black and white is acceptable, but that Black and white or black and White is not. --User:Khajidha (talk) (contributions) 15:23, 23 July 2025 (UTC)[reply]
It's what the note in that section has been saying for a long time, however: "with no consensus to implement a rule requiring either or against mixed use where editors at a particular article believe it's appropriate" (emphasis added). Since there was no consensus against mixed use (Black, but white), it's allowed. Gawaon (talk) 17:11, 23 July 2025 (UTC)[reply]

Arbitration notice

There is an arbitration case involving this topic at Wikipedia:Arbitration/Requests/Case#Capitalization Disputes. Left guide (talk) 20:52, 26 June 2025 (UTC)[reply]

Clarifying MOS:DOCTCAPS

Doctrines, ideologies, philosophies, theologies, theories, movements, methods, processes, systems or schools of thought and practice, and fields of academic study or professional practice are not capitalized, unless the name derives from a proper name.

I think perhaps this section could do with some more clarity or more examples of what is considered a movement, method, process, etc.

I've been reconsidering capitalization edits I made on Teach the Controversy, but I'm not sure how to interpret this policy in relation to the article/phrase. I consulted previous discussions about this policy, but in most instances it was fairly straightforward and/or didn't apply to this case. I was confused by some instances in article titles/article body, but perhaps they are cases consistently capitalized in a substantial majority of independent, reliable sources: Third World socialism, Third-Worldism, third-world?, Non-Aligned Movement, Manifest destiny, Global War Party.

Considering other Discovery Institute campaigns, there's also "Critical Analysis of Evolution", "Free Speech on Evolution", "Academic freedom campaign" (but Academic Freedom bills). Then there's Intelligent Design and Wedge Strategy which are often capitalized but have lowercase article titles. I feel the capitalization is helpful in the case of Critical Analysis of Evolution because, if lowercase, the words would seem to mean something else. With Teach the Controversy, it's short enough that using "teach the controversy" strategy doesn't feel too repetitive, but always enclosing in quotation marks feels unnecessary. But without any distinction, it could get reinterpreted as teach the [controversy strategy].

In some of the articles I linked, the capitalization of the article title wouldn't match all instances in the body. Is this because MOS:DOCTCAPS is more important for article names and consistency isn't necessary unless it's a problem? Or are these various inconsistencies themselves violations of MOS:DOCTCAPS that just haven't been corrected?  – Kilvin👾 03:42, 18 July 2025 (UTC)[reply]

  • What you appear to be describing are terms of art or a term that has a specialized meaning in a particular field or profession [15]. These fall to MOS:SIGNIFCAPS (as well as DOCTCAPS) - Introduction of a term of art may be wikilinked and, optionally, given in non-emphasis italics on first occurrence - ie once it is identified as a term of art by the use of italics, italics need not be used thereafter. Cinderella157 (talk) 08:01, 18 July 2025 (UTC)[reply]

This was cited for moving "List of assets owned by The Coca-Cola Company" (regardless of the company's capitalization of "the"). The article "The Pokémon Company", however, consistently capitalizes "the". Should both use uncapitalized "the" per this guideline? J3133 (talk) 07:10, 21 July 2025 (UTC)[reply]

Yes. Gawaon (talk) 07:34, 21 July 2025 (UTC)[reply]

antisemitism -v- anti-Semitism

The section "Peoples and their languages" states, "antisemitism, which is preferred in wikivoice per the consensus of scholars and historians of antisemitism" – but "consensus" is not true: there is a small majority, but not a consensus (i.e. "general agreement among a group of people"). The Internet Archive lists 2,740 texts with "antisemitism" in the title and 2,114 texts with "anti-Semitism" in the title. I note that The Oxford English Dictionary and Merriam-Webster give only one version of the term: "anti-Semitism". It seems rather an odd formation, given that German has had "Antisemitismus" since the 1870s and French has had "antisémitisme" almost as long, but I do not think Wikipedia should fly in the face of the two most authoritative English and American lexical sources. Was the current wording of the MoS discussed, and if so, where can one find the discussion? Tim riley talk 07:05, 1 August 2025 (UTC)[reply]

The last discussion was in March, and Zanahary made the guideline change. Just skimming here, but it seems that the guideline would be stronger if it noted Wikipedia consensus to use "antisemitism" without making a claim about scholars and historians. Firefangledfeathers (talk / contribs) 13:31, 1 August 2025 (UTC)[reply]
No objection Zanahary 16:23, 1 August 2025 (UTC)[reply]
My thanks, Firefangledfeathers. I think the decision you refer to is ill-informed but I shan't raise the matter again. Tim riley talk 19:09, 1 August 2025 (UTC)[reply]

Capitalization "black", "white" and "colo[u]red"


There is no single universal rule for capitalizing "black" and "white" when referring to people, although capitalization is more common in some American style guides. It can be nuanced: for example, according to The Guardian, Minna Salami, who is Finnish Nigerian, dislikes capitalizing "black" in reference to people because she opposes the imposition of any single rule regarding how black people should define themselves. In South Africa, the term "colored" should not be capitalized, according to the South African Editorial Style Guide by the government in South Africa (https://www.gcis.gov.za/sites/default/files/docs/resourcecentre/guidelines/Editorial_Style_Guide.pdf). The Oxford dictionary stated that the capitalization of these terms is a stylistic choice, rather than a strict rule. The term "African American" should not be hyphenated. MarcoToa1 (talk) 09:43, 8 October 2025 (UTC)[reply]

Capitalizing "white" is optional, since it hasn't developed a widespread, accepted cultural identity and community to the same extent. Some style guides, such as APA style, capitalize both "white" and "black". MarcoToa1 (talk) 09:45, 8 October 2025 (UTC)[reply]
It's best to ask the writer's or author's preference about the capitalization. MarcoToa1 (talk) 09:47, 8 October 2025 (UTC)[reply]
You can compare:
Black people, black people, White people, white people, Coloured people, coloured people
in Ngram; the search must be case-sensitive. MarcoToa1 (talk) 09:49, 8 October 2025 (UTC)[reply]
You can also compare other style guides. MarcoToa1 (talk) 09:50, 8 October 2025 (UTC)[reply]
It should be "compare with". MarcoToa1 (talk) 09:51, 8 October 2025 (UTC)[reply]
This is covered in MOS:RACECAPS. Gawaon (talk) 10:08, 8 October 2025 (UTC)[reply]
Thanks. MarcoToa1 (talk) 12:05, 8 October 2025 (UTC)[reply]