Created and maintained by Martin X. Moleski, SJ.
First draft: 2006-03-20.
Updated: 2008-10-07.
Note well: I did my level best to assimilate all the input I was given about naming conventions.
This document is a product of a personal good-faith effort to figure them out.
Although I think it is accurate, it has no official standing whatsoever.
2. Rules governing the formation of name components:
2.2 A component should not contain characters other than [a-z0-9+-].
2.3 A component must contain at least one non-digit.
2.4 A component must not contain ANY UPPERCASE LETTERS.
2.5 A component must begin with a letter or digit.
Appendix A: A Few Examples of Existing Big-8 Group Names and Their Descriptions
Appendix B: The Longest Names in the Big-8 (2006-03-19)
References: Various Documents Describing Usenet Protocols and Naming Standards
This is an informal and unapproved set of observations on how to properly form names for newsgroup proposals. It represents what I have recently learned about Usenet naming conventions. I would be happy to revise the document to keep pace with revisions in Usenet policy when and if those revisions become generally accepted. Folks may not use this document to berate the custodians of the Big-8 because the board did not write this document. I did.
Those who wish to read the authoritative documents may skip straight to the References. Those who know how to read such documents may use them as a stick with which to beat the custodians of the Big-8 because these documents do describe the standards, if any, to which all Usenet groups should conform, if the administrators of a news server want them to. (Some news administrators just ignore the authoritative documents and do as they please. Someone should probably visit them and tell them to canonicalize their system!)
After this document was written, PJ Ross wrote a script to check the validity of the name and newsgroup description line.
Any flaws in this document are my fault. Any correspondence to actual Usenet practice is due to the patient answers others have made to my questions. To all of them, my heartfelt thanks.
Examples:
news.groups has two components.
news.announce.newgroups has three components.
rec.aviation.flying has three components.
rec.autos.sport.nascar.moderated has five components.
2.1 The first component in the name of a Big-8 newsgroup will be the name of one of the eight hierarchies:
comp
humanities
misc
news
rec
sci
soc
talk
When we talk about "the Big-8 newsgroups," this is what we mean: all of the newsgroups that are found in these eight hierarchies.
2.2 A component should not contain characters other than [a-z0-9+-].
That is to say, the characters allowed are:
-- lowercase letters (a to z)
-- digits (0 to 9)
-- the plus sign: +
-- the minus sign: -
A few quick examples. More below.
comp.lang.c++.moderated
misc.immigration.australia+nz
misc.industry.pulp-and-paper
rec.arts.sf.tv.babylon5.info
Note well: The underscore character ("_") is allowed in components by NNTP but is traditionally not used in the Big-8.
2.3 A component must contain at least one non-digit.
In other words, "rec.autos.makers.ford.500" is invalid because the component "500" is made up only of digits.
2.4 A component must not contain ANY UPPERCASE LETTERS.
This is just a repetition of one part of rule 2.2. But sometimes it takes a little while for the meaning of the rules to sink in. "Knowledge maketh a bloody entrance."
Some servers pay attention to the differences between REC, REc, ReC, Rec, rEC, rEc, reC, and rec. These systems are "case sensitive." To the human reader, all these variations are exactly the same syllable. To a computer, each one of these variants is a unique string of bytes; using base-16 representation of the ASCII values:
REC = 52 45 43
Rec = 52 65 63
rEc = 72 45 63
rec = 72 65 63
To make sure no ambiguity arises from typos or random capitalization, the rule for newsgroup names is lowercase only. (In some of the earliest operating systems for personal computers, the rule was that all filenames WOULD BE STORED IN UPPERCASE ONLY.)
2.5 A component must begin with a letter or digit.
Components beginning with the underscore character are reserved for future development in Usenet. So, if all goes well, you won't see a newsgroup name such as "comp._my_computer_is_better_than_yours_".
2.6 Reserved words must not be used as components.
These words have a special meaning for some of the servers that are used to host newsgroups: 'all' and 'ctl'. There may be other reserved words as well ("example", "to", "poster", "control", "junk"--cf. USEFOR).
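A minimal sketch, in Python, of how rules 2.1 through 2.6 might be checked mechanically. It is not PJ Ross's script and has no more official standing than the rest of this document. The function name check_name, the message wording, and the decision to treat every word mentioned under rule 2.6 as reserved are assumptions made for illustration.

```python
import re

# The eight hierarchies from rule 2.1.
BIG8 = {"comp", "humanities", "misc", "news", "rec", "sci", "soc", "talk"}

# Rule 2.6 names 'all' and 'ctl'; the rest are listed there only as
# possibly reserved. Treating them all as reserved is an assumption.
RESERVED = {"all", "ctl", "example", "to", "poster", "control", "junk"}


def check_name(name):
    """Return a list of rule violations for a proposed Big-8 group name."""
    problems = []
    components = name.split(".")
    if components[0] not in BIG8:
        problems.append("2.1: first component is not a Big-8 hierarchy")
    for comp in components:
        # Case is reported separately under 2.4, so allow A-Z here.
        if not re.fullmatch(r"[A-Za-z0-9+-]+", comp):
            problems.append(f"2.2: {comp!r} has characters outside [a-z0-9+-]")
        if re.fullmatch(r"[0-9]+", comp):
            problems.append(f"2.3: {comp!r} is made up only of digits")
        if comp != comp.lower():
            problems.append(f"2.4: {comp!r} contains uppercase letters")
        if not re.match(r"[A-Za-z0-9]", comp):
            problems.append(f"2.5: {comp!r} does not begin with a letter or digit")
        if comp in RESERVED:
            problems.append(f"2.6: {comp!r} is a reserved word")
    return problems


for candidate in ("comp.lang.c++.moderated", "rec.autos.makers.ford.500"):
    print(candidate, check_name(candidate))
```

An empty list means the name passed every check in this sketch; it says nothing about whether the name is a good one.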
2.7 As a general rule, components (and names generally) should be as short as is consistent with comprehensibility.
Or one might say that the right length strikes a balance between brevity and intelligibility.
In the early years of Usenet, components could only be 14 characters long because of restrictions on the length of file names. Those limits have been lifted in practice. Wise, kindly, and knowledgeable people recommend keeping the components of names under 20 characters (one proposal) or 30 characters (another proposal).
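For anyone who wants to see where a proposed name stands against those figures, here is a small Python sketch. The function name and the dictionary keys are made up for illustration; the thresholds are simply the 14-character historic limit and the two recommendations mentioned above, not official limits.

```python
def component_length_report(name):
    """Compare the longest component of a name with the figures quoted above."""
    longest = max(name.split("."), key=len)
    n = len(longest)
    return {
        "longest": longest,
        "length": n,
        "fits_historic_14": n <= 14,  # the old file-name restriction
        "under_20": n < 20,           # one recommendation
        "under_30": n < 30,           # another recommendation
    }


print(component_length_report("comp.software.extreme-programming"))
print(component_length_report("comp.ai.nlang-know-rep"))
```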
I see a 497 UTF-8 octet limit on newsgroup names, created by a 497 UTF-8 octet limit on command parameters in RFC 3977. I don't see much else limiting newsgroup names there. Yes, it does mention a prohibition on wildcards in the newsgroup name, but not much else. - Mark Kramer
The UKnet and some of the articles in the References section strongly discourage wasted verbiage in the description, such as "Discussion of ". The vast majority of newsgroups are dedicated to discussion of the topic to which the group is devoted.
Some samples of the longest components in the Big-8 (2006-03-19):
14 characters:
nlang-know-rep
administration
net-management
paint-shop-pro
net-happenings
visual-objects
laser-printers
19 characters:
extreme-programming
Observations from Jim Riley:
There may be cases where a name was shortened to less than 14. For example:
rec.arts.books.hist-fiction (12)
rec.arts.books.historical-fiction (18)
rec.arts.tv.uk.coronation-st (13)
rec.arts.tv.uk.coronation-street (17)
rec.pets.cats.health+behav (12)
rec.pets.cats.health+behavior (15)
rec.scouting.guide+girl (10)
rec.scouting.??? (?)
There are some other instances where two components have been substituted for a longer name, as in
soc.culture.african.american
soc.culture.asian.american
soc.culture.mexican.american
In many cases, multiple components can provide enough context that the latter components can be shorter (I'm not referring to the above 3 groups)...
In the case of extreme-programming (19), the long name helps clarify the meaning of the sub-hierarchy comp.software.*, which is not about software per se, but more about software development, programming, distribution, etc. It is also a different usage than in news.software.* which is about software used for Netnews and Usenet.
Here is a curiosity for you. Originally, the entire newsgroup name had to fit in 14 significant characters. We now have rec.sport.* rather than rec.sports.* because when a baseball newsgroup was established, it was realized that net.sport.base(ball) would leave room in the namespace for net.sport.bask(etball). Never mind that when the basketball group was eventually created it was called ".hoops".
> Longest names (46 to 50 characters):
>
> misc.forsale.computers.mac-specific.cards.misc
> misc.forsale.computers.mac-specific.cards.video
> misc.forsale.computers.pc-specific.cards.video
> misc.forsale.computers.pc-specific.motherboards
Somewhat a result of incremental development. Originally there were misc.forsale and misc.forsale.computers. The first was later renamed misc.forsale.non-computers.
The first split of mf.computers produced several groups including mf.computers.mac and mf.computers.pc-clone. Several other groups failed, including groups for "printers", "software", "storage", and "comm". This is one instance where the 2/3 rule had an effect. Overall, the votes were on the order of 300-odd to 150-odd, with most voters casting straight-ticket Yes or No votes. But there was enough ticket splitting such that the Yes percentage ranged from 63.7% to 71.4%.
A later re-organization started out with the intent to add the failed proposed groups which were primarily for peripherals that were _not-specific_ to a particular computer. Then the proposal was changed to incorporate renaming mf.computers.mac and mf.computers.pc-clone to mf.computers.mac-specific and mf.computers.pc-specific to emphasize that the other groups were generic. And finally, the split of the mac and PC groups was proposed (.cards.misc and .cards.video were proposed at the same time). There were 29 separate items on the ballot, including separate items to create each of the mf.computers.pc-specific.* and mf.computers.mac-specific.* groups, as well as removing the existing mf.computers.mac and mf.computers.pc-clone groups.
> rec.games.trading-cards.marketplace.magic.auctions
> rec.games.trading-cards.marketplace.magic.sales
> rec.games.trading-cards.marketplace.magic.trades
This is somewhat odd, in that the rec.games.trading-cards.* hierarchy had a .magic, .marketplace, and .misc group when the .marketplace group was split into .magic.*, and .misc groups.
Had the groups evolved at an earlier time, they probably would have been rec.games.magic, which split off rec.games.magic.marketplace that was then split into rec.games.magic.auctions, rec.games.magic.sales, and rec.games.magic.trades.
The purpose of the short description is to add information not found in the group name, not simply repeat it. There is no need to use expressions like "discussion of" to introduce the description, since all newsgroups are, by definition, intended to be for discussion.
For most of its history, the Big-8 allowed the group name, tab(s), and description to use all 80 columns. 79 columns is now recommended if one wishes to guarantee the fewest number of unnecessary wraparounds on 80-column terminals; in UKnet, they have dropped the period at the end of the description to free up an extra character. Some people in alt.config also recommend a combined length of no more than 79 characters as does Russ Allbery in his description of the newsgroups file.
When calculating the length of the newsgroups line, "tabs" traditionally fill up to 8 columns in a line per tab, depending on where the cursor is when the Tab key is struck. In the rulers below, I have inserted "|" after the 24th character because the rule requires inserting as many tabs as needed if a group name is shorter than 24 characters. I use "^" to show where the Tab stops are located. When all newsgroup names were short, all of the descriptions would then line up, beginning in the 25th column.
In the examples below, spaces between the group name and the description have been replaced with a middot (·) to show how the tab rule works. I've also changed the font to fixed-pitch to show how the name, spaces-to-the-tab-stop, and description should fit into 80 or fewer columns:
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
^·······^·······^·······|·······^·······^·······^·······^·······^·······^·······
comp.ai·^·······^·······Artificial Intelligence.
comp.ai.jair.announce···Announcements & abstracts of the Journal of AI Research.
rec.arts.comics.info····Reviews, convention information and other comics news.
rec.arts.startrek.reviews·······Reviews of Star Trek books, episodes, films, &c.
rec.arts.comics.marvel.universe·Marvel Comics' shared universe and characters.
rec.collecting.villages·Collectible houses, cottages, villages, and accessories.
rec.outdoors.fishing.fly.tying··Issues relating to tying flies for flyfishing.
sci.op-research·^·······Research, teaching & application of operations research.
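The tab rule illustrated by the rulers can be sketched in a few lines of Python. This is only my reading of the tradition, not an official algorithm: the function names are hypothetical, and the treatment of names of 24 characters or more (a single tab) is inferred from the examples above.

```python
import math

TAB = 8        # traditional tab stops every 8 columns
DESC_COL = 24  # short names are padded so the description starts in column 25


def tabs_needed(name):
    """Number of tabs to put between a group name and its description."""
    if len(name) >= DESC_COL:
        return 1  # long names: one tab, to the next tab stop
    return math.ceil((DESC_COL - len(name)) / TAB)


def newsgroups_line(name, description):
    """Build one line of a newsgroups/checkgroups listing."""
    return name + "\t" * tabs_needed(name) + description


def show_tabs(line, fill="·"):
    """Expand tabs to 8-column stops, using a visible fill character."""
    out = ""
    for ch in line:
        if ch == "\t":
            out += fill * (TAB - len(out) % TAB)
        else:
            out += ch
    return out


print(show_tabs(newsgroups_line("comp.ai", "Artificial Intelligence.")))
print(show_tabs(newsgroups_line("rec.arts.startrek.reviews",
                                "Reviews of Star Trek books, episodes, films, &c.")))
```

The printed lines show the same padding as the middots in the examples, minus the "^" tab-stop markers.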
Once upon a time, on a faraway server, the tab rule produced a very neat list of names and descriptions, according to Jim Riley:
Yes. I have seen a checkgroups from just after the Great Renaming where every description except two begins in column 25 (3rd tab position). The exceptions are for:
comp.protocols.appletalk (24)
news.announce.conferences (25)
At that time the practical effect would have been that all descriptions had a common left alignment, regardless whether they had short names (7 or fewer), ordinary names (8 to 15), or long names (16 to 23), and there was a work-around for exceptionally long names.
If you look at the Big 8 checkgroups, there are sections where this alignment still largely works, and sections where deeper hierarchies completely obliterate any alignment.
All moderated newsgroups must add the tag "(Moderated)" one space after the period that ends the newsgroup description. This moderation flag does not count against the 79/80-character limit for the newsgroup name, tab, and description.
The period at the end of the newsgroup description may now be optional. Those who have enough space to do so should probably preserve the period for backward-compatibility.
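Here is a hedged Python sketch of how the length of a newsgroups line might be measured under these traditions: tabs are expanded to 8-column stops and a trailing " (Moderated)" flag is left out of the count. The names counted_width and fits are hypothetical, and the default limit of 79 is the recommendation mentioned above rather than a hard rule.

```python
MOD_FLAG = " (Moderated)"


def counted_width(line):
    """Columns used by name, tab(s), and description, ignoring the flag."""
    if line.endswith(MOD_FLAG):
        line = line[: -len(MOD_FLAG)]
    return len(line.expandtabs(8))


def fits(line, limit=79):
    """True if the counted part of the line stays within the limit."""
    return counted_width(line) <= limit


line = ("comp.ai.nlang-know-rep\t"
        "Natural Language and Knowledge Representation. (Moderated)")
print(counted_width(line), fits(line))
```

The full line, flag included, runs past 80 columns, which is exactly the case the exemption is meant to cover.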
The principles discussed so far are at the lowest level of grammar, spelling, and punctuation. There are higher-level issues discussed in the brilliant and irreplaceable discussion by David Wright, "Guidelines on Usenet Newsgroup Names" (1999). There is an art to locating a new newsgroup in its proper place in the existing lists and there are no rules by which disagreements over the higher-level issues of taxonomy can be resolved. Someone makes a decision, a group gets created, and Usenet goes on.
Miscification: An attentive reader says, "But looking again at those examples, I wonder if it's a good idea to include examples of *.misc groups without explaining the special circumstances in which they're created, i.e. when rec.foo exists and rec.foo.bar is proposed. (Disclaimer: I'm not entirely sure I understand the rules for adding components myself.)"
Russ Allbery: nlang-know-rep "is a great example of a group that got a completely incomprehensible name because we were enforcing the 14-character limit. It's also a great example of a group that's hard to name in even 30 characters, as I believe that was supposed to be natural-language-knowledge-representation. Which I think we'd all agree is too long, but shortening it is... hard.
"But what we've got abbreviates *every* keyword that someone would search
for, making the group far harder to find."
I have just cut these lines from the Checkgroups list posted in news.announce.newgroups. If the formatting looks funny, that's because the 'rules' (traditions) for formatting produce funny results when there are longer newsgroup names.
" (Moderated)" is an essential flag for identifying which groups are moderated; the letters in this flag do not count against the 80-character rule.
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
comp.ai.nat-lang        Natural language processing by computers.
comp.ai.neural-nets     All aspects of neural networks.
comp.ai.nlang-know-rep  Natural Language and Knowledge Representation. (Moderated)
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
rec.games.trading-cards.marketplace.magic.auctions      Auctions of Magic cards.
rec.games.trading-cards.marketplace.magic.sales Selling Magic cards.
rec.games.trading-cards.marketplace.magic.trades        Trading Magic cards.
rec.games.trading-cards.marketplace.misc        Trading trading card stuff.
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
misc.metric-system      The International System of Units.
misc.misc               Various discussions not fitting in any other group.
misc.news.bosnia        News, articles, reports & information on Bosnia. (Moderated)
misc.news.east-europe.rferl     Radio Free Europe/Radio Liberty Daily Report. (Moderated)
misc.news.internet.announce     News bulletins from the Internet. (Moderated)
misc.news.internet.discuss      Discussion of news bulletins from the Net.
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
news.announce.important General announcements of interest to all. (Moderated)
news.announce.newgroups Calls for newgroups & announcements of same. (Moderated)
news.announce.newusers  Explanatory postings for new users. (Moderated)
news.answers            Repository for periodic USENET articles. (Moderated)
news.groups             Discussions and lists of newsgroups.
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
rec.arts.books.childrens        All aspects of children's literature.
rec.arts.books.hist-fiction     Historical fictions (novels) in general.
rec.arts.books.marketplace      Buying and selling of books.
rec.arts.books.reviews  Book reviews. (Moderated)
rec.arts.books.tolkien  The works of J.R.R. Tolkien.
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
sci.crypt               Different methods of data en/decryption.
sci.crypt.random-numbers        Generating cryptographic strength randomness.
sci.crypt.research      Cryptography, cryptanalysis, and related issues. (Moderated)
sci.data.formats        Modelling, storage and retrieval of scientific data.
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
soc.genealogy.surnames.global   Surnames queries central database. (Moderated)
soc.genealogy.surnames.ireland  Surnames queries - Ireland & Northern Ireland. (Moderated)
soc.genealogy.surnames.misc     Surnames - regions not covered elsewhere. (Moderated)
soc.genealogy.surnames.usa      Surnames queries - USA. (Moderated)
soc.genealogy.west-indies       Genealogy of the West Indies.
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
talk.politics.drugs     The politics of drug issues.
talk.politics.european-union    The EU and political integration in Europe.
talk.politics.guns      The politics of firearm ownership and (mis)use.
talk.politics.libertarian       Libertarian politics & political philosophy.
talk.politics.medicine  The politics and ethics involved with health care.
         1         2         3         4         5         6         7         8
123456789012345678901234|6789012345678901234567890123456789012345678901234567890
misc.forsale.computers.mac-specific.cards.misc
misc.forsale.computers.mac-specific.cards.video
misc.forsale.computers.pc-specific.cards.video
misc.forsale.computers.pc-specific.motherboards
rec.games.trading-cards.marketplace.magic.auctions
rec.games.trading-cards.marketplace.magic.sales
rec.games.trading-cards.marketplace.magic.trades
1987: RFC 1036
1994: Son of RFC 1036
1999: David Wright, Guidelines on Usenet Newsgroup Names
2006: I-Ds List Working Group, Usenet Article Standard Update (usefor)
2006: Russ Allbery, ISC README
Newsgroup name/description validator
The news.groups thread that led to this article.