From Wikipedia, the free encyclopedia
This Python code is shown with coloring that highlights syntactic aspects.

The syntax of computer source code is the form that it has – specifically without concern for what it means (semantics). Like a natural language, a computer language (i.e. a programming language) defines the syntax that is valid for that language.[1] A syntax error occurs when syntactically invalid source code is processed by a tool such as a compiler or interpreter.
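As a minimal sketch of how a tool reports a syntax error, the following uses Python's built-in compile function, which parses source text without running it. The helper name is_syntactically_valid is hypothetical and chosen only for illustration.

```python
def is_syntactically_valid(source: str) -> bool:
    """Return True if Python's compiler accepts `source` as well-formed."""
    try:
        # compile() parses and compiles the text but does not execute it.
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(is_syntactically_valid("x = 1 + 2"))           # well-formed assignment
print(is_syntactically_valid("x = = 1"))             # malformed: two '=' in a row
print(is_syntactically_valid("undefined_name + 1"))  # well-formed, though it would fail when run
```

The last call shows that the check concerns form only: the expression is accepted even though the name it mentions is never defined.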

The most commonly used languages are text-based with syntax based on sequences of characters. Alternatively, the syntax of a visual programming language is based on relationships between graphical elements.

When designing the syntax of a language, a designer might start by writing down examples of both legal and illegal strings, before trying to figure out the general rules from these examples.[2]

Levels of syntax

Computer language syntax is generally divided into three levels:

  • Words – the lexical level, determining how characters form tokens;
  • Phrases – the grammar level, narrowly speaking, determining how tokens form phrases;
  • Context – determining what objects or variable names refer to, whether types are valid, etc.

Distinguishing in this way yields modularity, allowing each level to be described and processed separately and often independently.

First, a lexer turns the linear sequence of characters into a linear sequence of tokens; this is known as "lexical analysis" or "lexing".[3]
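The lexing step can be sketched with Python's re module. The token categories below (NUMBER, NAME, OP, SKIP) and the function name lex are assumptions made for this illustration; they do not correspond to any particular language's lexical grammar.

```python
import re

# Each token category is described by a regular expression (illustrative only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def lex(text):
    """Turn a string of characters into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(lex("total = price * (3 + 4)"))
```

The result is still a flat sequence; grouping the tokens into nested phrases is left to the parser.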

Second, the parser turns the linear sequence of tokens into a hierarchical syntax tree; this is known as "parsing" narrowly speaking. This ensures that the sequence of tokens conforms to the formal grammar of the programming language. The parsing stage itself can be divided into two parts: the parse tree, or "concrete syntax tree", which is determined by the grammar, but is generally far too detailed for practical use, and the abstract syntax tree (AST), which simplifies this into a usable form. The AST and contextual analysis steps can be considered a form of semantic analysis, as they are adding meaning and interpretation to the syntax, or alternatively as informal, manual implementations of syntactical rules that would be difficult or awkward to describe or implement formally.
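The simplification performed by an AST can be observed with Python's standard ast module, used here only as a convenient example of a real parser. The helper name shape is hypothetical.

```python
import ast


def shape(source):
    """Return the AST of `source` as text, without line and column attributes."""
    return ast.dump(ast.parse(source))


# Redundant parentheses are concrete syntax; they leave no trace in the AST.
print(shape("(1)") == shape("1"))

# Parentheses that change grouping do change the tree.
print(shape("a * (b + c)") == shape("a * b + c"))

# The tree is hierarchical: an assignment whose value is a product,
# whose right operand is in turn a sum.
assign = ast.parse("total = price * (3 + 4)").body[0]
print(type(assign).__name__)
print(type(assign.value).__name__, type(assign.value.op).__name__)
print(type(assign.value.right).__name__, type(assign.value.right.op).__name__)
```

The first comparison holds because the AST records grouping through nesting rather than through the tokens that expressed it.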

Third, contextual analysis resolves names and checks types. This modularity is sometimes possible, but in many real-world languages an earlier step depends on a later step – for example, the lexer hack in C arises because tokenization depends on context. Even in these cases, syntactical analysis is often seen as approximating this ideal model.

The levels generally correspond to levels in the Chomsky hierarchy. Words are in a regular language, specified in the lexical grammar, which is a Type-3 grammar, generally given as regular expressions. Phrases are in a context-free language (CFL), generally a deterministic context-free language (DCFL), specified in a phrase structure grammar, which is a Type-2 grammar, generally given as production rules in Backus–Naur form (BNF). Phrase grammars are often specified in much more constrained grammars than full context-free grammars, in order to make them easier to parse; while the LR parser can parse any DCFL in linear time, the simple LALR parser and even simpler LL parser are more efficient, but can only parse grammars whose production rules are constrained. In principle, contextual structure can be described by a context-sensitive grammar, and automatically analyzed by means such as attribute grammars, though, in general, this step is done manually, via name resolution rules and type checking, and implemented via a symbol table which stores names and types for each scope.
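A symbol table of the kind mentioned above can be sketched as a chain of dictionaries, one per scope. The class name SymbolTable, its methods, and the type names stored as strings are all assumptions made for this illustration, not a description of any particular compiler.

```python
class SymbolTable:
    """Maps names to types for one scope, with a link to the enclosing scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}

    def declare(self, name, type_name):
        if name in self.symbols:
            raise NameError(f"{name!r} is already declared in this scope")
        self.symbols[name] = type_name

    def lookup(self, name):
        # Search the innermost scope first, then each enclosing scope in turn.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"{name!r} is not declared")


global_scope = SymbolTable()
global_scope.declare("x", "int")

inner_scope = SymbolTable(parent=global_scope)
inner_scope.declare("x", "float")  # shadows the outer x
inner_scope.declare("y", "int")

print(inner_scope.lookup("x"))
print(global_scope.lookup("x"))
```

Name resolution then amounts to a lookup from the scope in which a name is used, and a failed lookup corresponds to an undeclared-name error.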

Tools have been written that automatically generate a lexer from a lexical specification written in regular expressions and a parser from the phrase grammar written in BNF: this allows one to use declarative programming, rather than requiring procedural or functional programming. A notable example is the lex-yacc pair. These automatically produce a concrete syntax tree; the parser writer must then manually write code describing how this is converted to an abstract syntax tree. Contextual analysis is also generally implemented manually. Despite the existence of these automatic tools, parsing is often implemented manually, for various reasons – perhaps the phrase structure is not context-free, or an alternative implementation improves performance or error-reporting, or allows the grammar to be changed more easily. Parsers are often written in functional languages, such as Haskell, or in scripting languages, such as Python or Perl, or in C or C++.

Syntax definition

Parse tree of Python code with inset tokenization

The syntax of textual programming languages is usually defined using a combination of regular expressions (for lexical structure) and Backus–Naur form (a metalanguage for grammatical structure) to inductively specify syntactic categories (nonterminals) and terminal symbols.[4] Syntactic categories are defined by rules called productions, which specify the values that belong to a particular syntactic category.[1] Terminal symbols are the concrete characters or strings of characters (for example keywords such as define, if, let, or void) from which syntactically valid programs are constructed.

Syntax can be divided into context-free syntax and context-sensitive syntax.[4] Context-free syntax consists of rules directed by the metalanguage of the programming language. These rules are not constrained by the context surrounding or referring to that part of the syntax, whereas the rules of context-sensitive syntax are.

A language can have different equivalent grammars, such as equivalent regular expressions (at the lexical level), or different phrase rules which generate the same language. Using a broader category of grammars, such as LR grammars, can allow shorter or simpler grammars compared with more restricted categories, such as LL grammars, which may require longer grammars with more rules. Different but equivalent phrase grammars yield different parse trees, though the underlying language (set of valid documents) is the same.

Example: Lisp S-expressions

Below is a simple grammar, written in the notation of regular expressions and Extended Backus–Naur form. It describes the syntax of S-expressions, a data syntax of the programming language Lisp, and defines productions for the syntactic categories expression, atom, number, symbol, and list:

expression = atom   | list
atom       = number | symbol    
number     = [+-]?['0'-'9']+
symbol     = ['A'-'Z']['A'-'Z''0'-'9'].*
list       = '(', expression*, ')'

This grammar specifies the following:

  • an expression is either an atom or a list;
  • an atom is either a number or a symbol;
  • a number is an unbroken sequence of one or more decimal digits, optionally preceded by a plus or minus sign;
  • a symbol is a letter followed by zero or more of any characters (excluding whitespace); and
  • a list is a matched pair of parentheses, with zero or more expressions inside it.

Here the decimal digits, upper- and lower-case characters, and parentheses are terminal symbols.

The following are examples of well-formed token sequences in this grammar: '12345', '()', '(A B C232 (1))'
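The grammar can be turned into a small recursive-descent parser, sketched below in Python. This is one possible reading, not a reference implementation: it follows the prose description of a symbol (a letter followed by any non-whitespace characters), additionally assumes that parentheses always end a symbol, and accepts lower-case letters. The names parse and parse_expression are hypothetical.

```python
import re

NUMBER_RE = re.compile(r"[+-]?[0-9]+")
# Assumption: a symbol is a letter followed by characters other than
# whitespace or parentheses.
SYMBOL_RE = re.compile(r"[A-Za-z][^\s()]*")


def parse_expression(tokens, pos=0):
    """Parse one expression at tokens[pos]; return (value, next position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        # list = '(', expression*, ')'
        items = []
        pos += 1
        while pos < len(tokens) and tokens[pos] != ")":
            item, pos = parse_expression(tokens, pos)
            items.append(item)
        if pos >= len(tokens):
            raise SyntaxError("missing closing parenthesis")
        return items, pos + 1
    if token == ")":
        raise SyntaxError("unexpected ')'")
    # atom = number | symbol
    if NUMBER_RE.fullmatch(token):
        return int(token), pos + 1
    if SYMBOL_RE.fullmatch(token):
        return token, pos + 1
    raise SyntaxError(f"invalid atom {token!r}")


def parse(text):
    """Parse a complete S-expression into nested Python lists."""
    tokens = re.findall(r"[()]|[^\s()]+", text)  # lexical level
    value, pos = parse_expression(tokens)        # phrase level
    if pos != len(tokens):
        raise SyntaxError("trailing input after expression")
    return value


print(parse("12345"))
print(parse("()"))
print(parse("(A B C232 (1))"))
```

Each function branch corresponds to one production, so the nesting of the returned lists mirrors the nesting of the parse tree.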

Complex grammars

The grammar needed to specify a programming language can be classified by its position in the Chomsky hierarchy. The phrase grammar of most programming languages can be specified using a Type-2 grammar, i.e., a context-free grammar,[5] though the overall syntax is context-sensitive (due to variable declarations and nested scopes), hence Type-1. However, there are exceptions, and for some languages the phrase grammar is Type-0 (Turing-complete).

In some languages like Perl and Lisp the specification (or implementation) of the language allows constructs that execute during the parsing phase. Furthermore, these languages have constructs that allow the programmer to alter the behavior of the parser. This combination effectively blurs the distinction between parsing and execution, and makes syntax analysis an undecidable problem in these languages, meaning that the parsing phase may not finish. For example, in Perl it is possible to execute code during parsing using a BEGIN statement, and Perl function prototypes may alter the syntactic interpretation, and possibly even the syntactic validity of the remaining code.[6][7] Colloquially this is referred to as "only Perl can parse Perl" (because code must be executed during parsing, and can modify the grammar), or more strongly "even Perl cannot parse Perl" (because it is undecidable). Similarly, Lisp macros introduced by the defmacro syntax also execute during parsing, meaning that a Lisp compiler must have an entire Lisp run-time system present. In contrast, C macros are merely string replacements, and do not require code execution.[8][9]

Syntax versus semantics

The syntax of a language describes the form of a valid program, but does not provide any information about the meaning of the program or the results of executing that program. The meaning given to a combination of symbols is handled by semantics (either formal or hard-coded in a reference implementation). Valid syntax must be established before semantics can make meaning out of it.[4] Not all syntactically correct programs are semantically correct. Many syntactically correct programs are nonetheless ill-formed per the language's rules, and may (depending on the language specification and the soundness of the implementation) result in an error on translation or execution. In some cases, such programs may exhibit undefined behavior. Even when a program is well-defined within a language, it may still have a meaning that is not intended by the person who wrote it.

Using natural language as an example, it may not be possible to assign a meaning to a grammatically correct sentence or the sentence may be false:

  • "Colorless green ideas sleep furiously." is grammatically well formed but has no generally accepted meaning.
  • "John is a married bachelor." is grammatically well formed but expresses a meaning that cannot be true.

The following C language fragment is syntactically correct, but performs an operation that is not semantically defined (because p is a null pointer, the operations p->real and p->im have no meaning):

 complex *p = NULL;
 complex abs_p = sqrt (p->real * p->real + p->im * p->im);

As a simpler example,

 int x;
 printf("%d", x);

is syntactically valid, but not semantically defined, as it uses an uninitialized variable. Even though compilers for some programming languages (e.g., Java and C#) would detect uninitialized variable errors of this kind, they should be regarded as semantic errors rather than syntax errors.[10][11]
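The same distinction can be sketched in Python, where the two stages can be invoked separately: compile checks only the form, and exec then runs the result. Unlike the C fragments above, Python defines the outcome of these mistakes as run-time exceptions rather than leaving them undefined. The function name classify is hypothetical.

```python
def classify(source):
    """Report whether `source` fails at the syntax stage, at run time, or not at all."""
    try:
        code = compile(source, "<example>", "exec")  # syntax is checked here
    except SyntaxError:
        return "syntax error"
    try:
        exec(code, {})  # meaning is only exercised here, in a fresh namespace
    except Exception as error:
        return f"runtime error: {type(error).__name__}"
    return "ok"


print(classify("x = = 1"))     # rejected by the parser
print(classify("print(x)"))    # well-formed, but x is never assigned
print(classify("1 + 'one'"))   # well-formed, but the operand types do not fit
print(classify("x = 1"))       # well-formed and meaningful
```

Only the first program is rejected on form; the second and third pass the syntax stage and fail once their meaning is evaluated.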

See also

  • Comparison of programming languages (syntax)
  • Naming convention (programming)
  • "Hello, World!" program

References

  1. Friedman, Daniel P.; Wand, Mitchell; Haynes, Christopher T. (1992). Essentials of Programming Languages (1st ed.). The MIT Press. ISBN 0-262-06145-7.
  2. Smith, Dennis (1999). Designing Maintainable Software. Springer Science & Business Media.
  3. Pai, Vaikunta; Aithal, P.S. (December 31, 2020). "A Systematic Literature Review of Lexical Analyzer Implementation Techniques in Compiler Design". International Journal of Applied Engineering and Management Letters. 4 (2): 285–301. doi:10.47992/IJAEML.2581.7000.0087. ISSN 2581-7000. SSRN 3770588.
  4. Slonneger, Kenneth; Kurtz, Barry (1995). Formal Syntax and Semantics of Programming Languages. Addison-Wesley Publishing Company. ISBN 0-201-65697-3.
  5. Sipser, Michael (1997). "2.2 Pushdown Automata". Introduction to the Theory of Computation. PWS Publishing. pp. 101–114. ISBN 0-534-94728-X.
  6. LtU comment clarifying that the undecidable problem is membership in the class of Perl programs
  7. chromatic's example of Perl code that gives a syntax error depending on the value of a random variable
  8. "An Introduction to Common Lisp Macros". Apl.jhu.edu. 2025-08-06. Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  9. "The Common Lisp Cookbook - Macros and Backquote". Cl-cookbook.sourceforge.net. 2025-08-06. Retrieved 2025-08-06.
  10. Semantic Errors in Java
  11. Issue of syntax or semantics?