Style Guide

These guidelines keep the articles consistent in form. More importantly, they may help readers understand the content better.

Formats#

Paragraphs#

Block(-like) Semantics Require Additional Newline#

A block semantic element must be followed by an additional newline so that goldmark generates a paragraph element (p); otherwise some visual styles may not be applied correctly.

Characters#

Punctuation#

Simple Marks#

CJK and Western punctuation still differ considerably, so an article's punctuation set should follow the main language of the article. Consistency should be maintained within the same article.

The table below lists the punctuation marks for each main language. Note that the period commonly used in CJK (。, U+3002) is replaced by a Western period, and the spacing rule after it follows Western punctuation. This is an easy way to simulate a full-width solid period (．, U+FF0E).

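| Punctuation | Western | CJK |
|-------------|---------|-----|
| Comma       | ,       | ，  |
| Period      | .       | .   |
| Question    | ?       | ？  |
| Exclamation | !       | ！  |
| Quotation   | "       | 「」 |
| Semicolon   | ;       | ；  |
| Colon       | :       | ：  |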

Dashes#

Dash here refers to a horizontal line character. It has several variants, including:

  • Hyphen - (U+002D): used to form compound words
  • Figure dash ‒ (U+2012): used to separate digits (such as area codes in phone numbers)
  • En dash – (U+2013): used to indicate ranges (numbers, dates)
  • Em dash — (U+2014): used in sentences of text

The dash normally used in CJK is composed of two em dashes (——). In Markdown, ndash and mdash are natively supported for convenience. For ease of use, figure dashes are treated as hyphens.
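The code points listed above can be checked with Python's standard `unicodedata` module. The following is a minimal sketch, not part of the site's tooling: the helper names `describe`, `merge_figure_dashes`, and `cjk_dash` are hypothetical and exist only for illustration.

```python
import unicodedata

# Dash variants from the list above, written as escapes so they stay unambiguous.
DASHES = {
    "hyphen": "\u002D",
    "figure dash": "\u2012",
    "en dash": "\u2013",
    "em dash": "\u2014",
}


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def merge_figure_dashes(text: str) -> str:
    """Replace every figure dash with a plain hyphen (hypothetical helper)."""
    return text.replace(DASHES["figure dash"], DASHES["hyphen"])


def cjk_dash() -> str:
    """Build the usual CJK dash: two consecutive em dashes."""
    return DASHES["em dash"] * 2


if __name__ == "__main__":
    for label, ch in DASHES.items():
        print(label, "->", describe(ch))
    # A phone-number style string that uses a figure dash as the separator.
    print(merge_figure_dashes("555\u20120100"))
```

Note that Unicode names U+002D "HYPHEN-MINUS" rather than plain "hyphen"; it is the character typed with the keyboard's minus key.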

I18N#

Language Codes#

The site follows the naming rules in BCP 47 for language tags. It currently configures the following languages:

| Path | Name    | Language Code  |
|------|---------|----------------|
| en   | English | en-US          |
| zh   | 中文    | zh-Hans-CN     |
| wu   | 吴语    | zh-wuu-Hans-CN |
| th   | ภาษาไทย | th-TH          |
| ja   | 日本語  | ja-JP          |
| ko   | 한국어  | ko-KR          |

All segments (or more formally, subtags) in the language codes above can be found in the IANA Language Subtag Registry.

Beyond these fixed rules, I have also taken into account the advice in the W3C article "Choosing a Language Tag", and this site prefers extended language subtags. For example, the subtag wuu for Wu Chinese is listed as both language and extlang in the registry:

Type: language
Subtag: wuu
Description: Wu Chinese
Added: 2009-07-29
Macrolanguage: zh
%%
Type: extlang
Subtag: wuu
Description: Wu Chinese
Added: 2009-07-29
Preferred-Value: wuu
Prefix: zh
Macrolanguage: zh

In such a case, the second record (the extlang wuu) is preferred, and its macrolanguage is included as a prefix, as in zh-wuu.
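To make the structure of the codes above concrete, here is a rough sketch that splits a tag into its subtags. It assumes a simplified grammar (language, optional extlang, optional script, optional region) and ignores variants, extensions, and private-use subtags that BCP 47 also allows. The names `split_tag` and `extlang_form` are hypothetical.

```python
import re
from typing import Dict, Optional

# Simplified pattern: language, optional extlang, optional script, optional region.
# Assumption: variants, extensions, and private-use subtags are out of scope.
TAG_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<extlang>[A-Za-z]{3}))?"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def split_tag(tag: str) -> Dict[str, Optional[str]]:
    """Split a language tag into named subtags; absent subtags are None."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unsupported or malformed language tag: {tag!r}")
    return match.groupdict()


def extlang_form(subtag: str, macrolanguage: str) -> str:
    """Prefix an extlang subtag with its macrolanguage, e.g. wuu -> zh-wuu."""
    return f"{macrolanguage}-{subtag}"


if __name__ == "__main__":
    for tag in ["en-US", "zh-Hans-CN", "zh-wuu-Hans-CN", "th-TH"]:
        print(tag, split_tag(tag))
    print(extlang_form("wuu", "zh"))
```

The sketch only checks the shape of each subtag; whether a subtag is actually registered still has to be looked up in the registry.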

CJK Mixed with Western#

In formal writing, Western words are separated by a single space. Punctuation is also followed by one space unless it is followed by a boundary (such as the end of a line or paragraph), for example:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque mollis erat eu diam pharetra luctus. Vestibulum a scelerisque libero. Orci varius natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Integer ac congue lorem. Praesent quis tempor diam. Mauris rhoncus, risus nec aliquet consequat, massa sem blandit risus, porttitor venenatis urna est nec nulla. In hac habitasse platea dictumst. Pellentesque pharetra sollicitudin tortor, non commodo arcu.

Quisque interdum ligula lobortis interdum volutpat. Vivamus at mi vel ex luctus dictum quis quis ipsum. Nunc bibendum dui a enim pharetra, sit amet condimentum erat ultricies. Nunc a leo aliquet, euismod metus eu, malesuada lacus. Sed sed libero erat. Morbi non ante a lorem maximus volutpat ac sit amet turpis. Sed at pulvinar mi, ut tristique enim. Integer non mauris id felis vehicula varius vitae nec velit. Pellentesque in bibendum odio. Ut nec mattis felis. Aliquam at euismod nulla. Mauris ac rhoncus nibh. Proin accumsan lorem a dui blandit aliquam.

When CJK characters are mixed with Western characters, the character spacing is hard to get right automatically, so extra spaces are needed. When the main language is CJK, put one space before and after inserted Western text unless it is adjacent to punctuation or a boundary, for example:

大家都知道 VAC 是 Valve 公司开发的游戏反作弊系统,被普遍应用于 Valve 公司
旗下的游戏,例如 CS:GO 和 Team Fortress 2,此外也有相当一部分第三方游戏
选用其作为反作弊手段,例如 Rust 和 Unturned.
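The spacing rule can be sketched as two regular-expression substitutions. This is only an illustration under stated assumptions: "CJK" is approximated by a few common Unicode ranges, "Western" is limited to ASCII letters and digits, and the function name `add_spacing` is hypothetical. Punctuation and boundaries are left alone because they match neither character class.

```python
import re

# Assumption: CJK is approximated by Han ideographs, Hiragana/Katakana,
# and Hangul syllables; this is not an exhaustive definition.
CJK = r"\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af"
# Assumption: "Western" means ASCII letters and digits here.
WESTERN = r"A-Za-z0-9"

_CJK_THEN_WESTERN = re.compile(f"([{CJK}])([{WESTERN}])")
_WESTERN_THEN_CJK = re.compile(f"([{WESTERN}])([{CJK}])")


def add_spacing(text: str) -> str:
    """Insert one space wherever a CJK character directly touches a Western one."""
    text = _CJK_THEN_WESTERN.sub(r"\1 \2", text)
    return _WESTERN_THEN_CJK.sub(r"\1 \2", text)


if __name__ == "__main__":
    print(add_spacing("例如Rust和Unturned."))
```

Text that is already spaced correctly passes through unchanged, so the function can be applied repeatedly without adding more spaces.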

Math Formula#

Notations#

Mathematical notation always comes from some common practice, which has a clear downside: readers unfamiliar with a specific branch of mathematics waste a lot of time just getting the notation right, even when they already understand the essence of the topic.

Assuming readers already have a mathematics level above middle school and are familiar with the following symbols’ meaning, most notations will not be explained in other articles. Some will be pointed out separately in specific articles according to different contexts for emphasis. In articles containing certain mathematical content, there will be a guide link to this section.

| Notation | Meaning |
|----------|---------|
| $$0,1,2,3,4,5,6,7,8,9$$ | Main alphabet in decimal system |
| $$\mathbb{N},\mathbb{Z},\mathbb{Q},\mathbb{R},\mathbb{C}$$ | Common number sets |
| $$\mathbb{P}$$ | Prime number set |
| $$\mathbb{Z}/n\mathbb{Z}$$ | The ring of integers modulo $n$ |
| $$(\mathbb{Z}/n\mathbb{Z})^*$$ | The multiplicative group of integers modulo $n$ |
| $$\binom{n}{k}$$ | The number of ways to choose $k$ items from a set of $n$ items |
| $$\Omega(n),\Theta(n),O(n)$$ | Asymptotic notation, common in complexity analysis |
| $$:=,:\equiv$$ | (defined to be) equal to, (defined to be) equivalent / congruent to |
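For readers who prefer a concrete instance, a few of these notations can be computed directly for small values. The sketch below assumes $n \ge 2$ and represents each residue class by its smallest non-negative member; the function names `ring` and `units` are hypothetical.

```python
from math import comb, gcd
from typing import List


def ring(n: int) -> List[int]:
    """Representatives of Z/nZ: the residues 0, 1, ..., n-1."""
    return list(range(n))


def units(n: int) -> List[int]:
    """Representatives of (Z/nZ)^*: residues coprime to n (assumes n >= 2)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


if __name__ == "__main__":
    n = 10
    print("Z/10Z     :", ring(n))
    print("(Z/10Z)^* :", units(n))
    # Binomial coefficient: ways to choose k = 2 items from n = 5 items.
    print("C(5, 2)   :", comb(5, 2))
```

For $n = 10$ the units are 1, 3, 7, and 9, since these are exactly the residues that have a multiplicative inverse modulo 10.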