// Copyright 2018 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*
Package arm64 implements an ARM64 assembler. Go assembly syntax is different from GNU ARM64
syntax, but we can still follow the general rules to map between them.

Instruction mnemonic mapping rules

1. Most instructions use width suffixes of instruction names to indicate operand width rather than
using different register names.

Examples:
	ADC R24, R14, R12          <=>     adc x12, x14, x24
	ADDW R26->24, R21, R15     <=>     add w15, w21, w26, asr #24
	FCMPS F2, F3               <=>     fcmp s3, s2
	FCMPD F2, F3               <=>     fcmp d3, d2
	FCVTDH F2, F3              <=>     fcvt h3, d2

2. Go uses .P and .W suffixes to indicate post-increment and pre-increment.

Examples:
	MOVD.P -8(R10), R8         <=>      ldr x8, [x10],#-8
	MOVB.W 16(R16), R10        <=>      ldrsb x10, [x16,#16]!
	MOVBU.W 16(R16), R10       <=>      ldrb x10, [x16,#16]!

3. Go uses a series of MOV instructions as load and store.

64-bit variant ldr, str, stur => MOVD;
32-bit variant str, stur, ldrsw => MOVW;
32-bit variant ldr => MOVWU;
ldrb => MOVBU; ldrh => MOVHU;
ldrsb, sturb, strb => MOVB;
ldrsh, sturh, strh => MOVH.
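
The mapping in rule 3 can be read as a lookup keyed by the GNU mnemonic and, for ldr, str and
stur, by the register width. The following is only a sketch of that reading. The helper name movFor
and its signature are hypothetical and are not part of the assembler.

```go
package main

import "fmt"

// movFor returns the Go MOV-family mnemonic for a GNU load/store mnemonic,
// following the table in rule 3. width is the register width in bits and only
// matters for ldr, str and stur; any width other than 64 is treated as 32.
// movFor is a hypothetical helper used for illustration only.
func movFor(gnu string, width int) (string, bool) {
	switch gnu {
	case "ldr":
		if width == 64 {
			return "MOVD", true
		}
		return "MOVWU", true
	case "str", "stur":
		if width == 64 {
			return "MOVD", true
		}
		return "MOVW", true
	case "ldrsw":
		return "MOVW", true
	case "ldrb":
		return "MOVBU", true
	case "ldrh":
		return "MOVHU", true
	case "ldrsb", "strb", "sturb":
		return "MOVB", true
	case "ldrsh", "strh", "sturh":
		return "MOVH", true
	}
	return "", false
}

func main() {
	for _, gnu := range []string{"ldr", "str", "ldrsb", "ldrh"} {
		op, _ := movFor(gnu, 32)
		fmt.Printf("%-6s (32-bit) => %s\n", gnu, op)
	}
}
```

Returning a second boolean keeps mnemonics outside the table distinguishable from valid entries.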

4. Go moves conditions into opcode suffix, like BLT.

5. Go adds a V prefix for most floating-point and SIMD instructions, except cryptographic extension
instructions and scalar floating-point instructions.

Examples:
	VADD V5.H8, V18.H8, V9.H8         <=>      add v9.8h, v18.8h, v5.8h
	VLD1.P (R6)(R11), [V31.D1]        <=>      ld1 {v31.1d}, [x6], x11
	VFMLA V29.S2, V20.S2, V14.S2      <=>      fmla v14.2s, v20.2s, v29.2s
	AESD V22.B16, V19.B16             <=>      aesd v19.16b, v22.16b
	SCVTFWS R3, F16                   <=>      scvtf s16, w3

6. Align directive

Go asm supports the PCALIGN directive, which indicates that the next instruction should be aligned
to a specified boundary by padding with NOOP instructions. The alignment value supported on arm64
must be a power of 2 and in the range of [8, 2048].

Examples:
	PCALIGN $16
	MOVD $2, R0          // This instruction is aligned to 16 bytes.
	PCALIGN $1024
	MOVD $3, R1          // This instruction is aligned to 1024 bytes.

PCALIGN also changes the function alignment. If a function has one or more PCALIGN directives,
its address will be aligned to the same or coarser boundary, which is the maximum of all the
alignment values.

In the following example, the function Add is aligned to 128 bytes.
Examples:
	TEXT ·Add(SB),$40-16
	MOVD $2, R0
	PCALIGN $32
	MOVD $4, R1
	PCALIGN $128
	MOVD $8, R2
	RET

On arm64, functions in Go are aligned to 16 bytes by default. PCALIGN can also be used to set the
function alignment. Functions that need to be aligned should preferably use NOFRAME and NOSPLIT
to avoid the impact of the prologues inserted by the assembler, so that the function address will
have the same alignment as the first hand-written instruction.

In the following example, PCALIGN at the entry of the function Add will align its address to 2048 bytes.

Examples:
	TEXT ·Add(SB),NOSPLIT|NOFRAME,$0
	PCALIGN $2048
	MOVD $1, R0
	MOVD $1, R1
	RET
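
The PCALIGN rules above can be summarized in a few lines of Go. This is a sketch of the rules as
stated in this section, not the assembler's implementation. The names validPCALIGN and funcAlign
are hypothetical, and treating the 16-byte default as a lower bound for the function alignment is
an assumption of the sketch.

```go
package main

import "fmt"

// validPCALIGN reports whether n is an alignment accepted on arm64 according
// to the text above: a power of 2 in the range [8, 2048].
func validPCALIGN(n int) bool {
	return n >= 8 && n <= 2048 && n&(n-1) == 0
}

// funcAlign returns the alignment of a function that contains the given
// PCALIGN values: the maximum of all values, starting from the 16-byte
// default. Using the default as a lower bound is an assumption of this sketch.
func funcAlign(pcaligns []int) (int, error) {
	align := 16
	for _, n := range pcaligns {
		if !validPCALIGN(n) {
			return 0, fmt.Errorf("invalid PCALIGN value %d", n)
		}
		if n > align {
			align = n
		}
	}
	return align, nil
}

func main() {
	// The Add example above uses PCALIGN $32 and PCALIGN $128.
	a, err := funcAlign([]int{32, 128})
	fmt.Println(a, err) // prints "128 <nil>"
}
```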

Special Cases.

(1) umov is written as VMOV.

(2) br is renamed JMP, blr is renamed CALL.

(3) No need to add "W" suffix: LDARB, LDARH, LDAXRB, LDAXRH, LDTRH, LDXRB, LDXRH.

(4) In Go assembly syntax, NOP is a zero-width pseudo-instruction that serves a generic purpose and
is unrelated to any real ARM64 instruction. NOOP stands for the hardware nop instruction. NOOP is an
alias of HINT $0.

Examples:
	VMOV V13.B[1], R20      <=>      umov w20, v13.b[1]
	VMOV V13.H[1], R20      <=>      umov w20, v13.h[1]
	JMP (R3)                <=>      br x3
	CALL (R17)              <=>      blr x17
	LDAXRB (R19), R16       <=>      ldaxrb w16, [x19]
	NOOP                    <=>      nop


Register mapping rules

1. All basic register names are written as Rn.

2. Go uses ZR as the zero register and RSP as the stack pointer.

3. Bn, Hn, Dn, Sn and Qn registers are written as Fn in floating-point instructions and as Vn
in SIMD instructions.


Argument mapping rules

1. The operands appear in left-to-right assignment order.

Go reverses the arguments of most instructions.

Examples:
	ADD R11.SXTB<<1, RSP, R25      <=>      add x25, sp, w11, sxtb #1
	VADD V16, V19, V14             <=>      add d14, d19, d16

Special Cases.

(1) The argument order is the same as in the GNU ARM64 syntax for cbz, cbnz and some store
instructions, such as str, stur, strb, sturb, strh, sturh, stlr, stlrb, stlrh, st1.

Examples:
	MOVD R29, 384(R19)      <=>      str x29, [x19,#384]
	MOVB.P R30, 30(R4)      <=>      strb w30, [x4],#30
	STLRH R21, (R19)        <=>      stlrh w21, [x19]

(2) MADD, MADDW, MSUB, MSUBW, SMADDL, SMSUBL, UMADDL, UMSUBL <Rm>, <Ra>, <Rn>, <Rd>

Examples:
	MADD R2, R30, R22, R6         <=>      madd x6, x22, x2, x30
	SMSUBL R10, R3, R17, R27      <=>      smsubl x27, w17, w10, x3

(3) FMADDD, FMADDS, FMSUBD, FMSUBS, FNMADDD, FNMADDS, FNMSUBD, FNMSUBS <Fm>, <Fa>, <Fn>, <Fd>

Examples:
	FMADDD F30, F20, F3, F29      <=>      fmadd d29, d3, d30, d20
	FNMSUBS F7, F25, F7, F22      <=>      fnmsub s22, s7, s7, s25

(4) BFI, BFXIL, SBFIZ, SBFX, UBFIZ, UBFX $<lsb>, <Rn>, $<width>, <Rd>

Examples:
	BFIW $16, R20, $6, R0         <=>      bfi w0, w20, #16, #6
	UBFIZ $34, R26, $5, R20       <=>      ubfiz x20, x26, #34, #5

(5) FCCMPD, FCCMPS, FCCMPED, FCCMPES <cond>, <Fm>, <Fn>, $<nzcv>

Examples:
	FCCMPD AL, F8, F26, $0        <=>      fccmp d26, d8, #0x0, al
	FCCMPS VS, F29, F4, $4        <=>      fccmp s4, s29, #0x4, vs
	FCCMPED LE, F20, F5, $13      <=>      fccmpe d5, d20, #0xd, le
	FCCMPES NE, F26, F10, $0      <=>      fccmpe s10, s26, #0x0, ne

(6) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, $<imm>, $<nzcv>

Examples:
	CCMP MI, R22, $12, $13        <=>      ccmp x22, #0xc, #0xd, mi
	CCMNW AL, R1, $11, $8         <=>      ccmn w1, #0xb, #0x8, al

(7) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, <Rm>, $<nzcv>

Examples:
	CCMN VS, R13, R22, $10        <=>      ccmn x13, x22, #0xa, vs
	CCMPW HS, R19, R14, $11       <=>      ccmp w19, w14, #0xb, cs

(9) CSEL, CSELW, CSNEG, CSNEGW, CSINC, CSINCW <cond>, <Rn>, <Rm>, <Rd> ;
FCSELD, FCSELS <cond>, <Fn>, <Fm>, <Fd>

Examples:
	CSEL GT, R0, R19, R1          <=>      csel x1, x0, x19, gt
	CSNEGW GT, R7, R17, R8        <=>      csneg w8, w7, w17, gt
	FCSELD EQ, F15, F18, F16      <=>      fcsel d16, d15, d18, eq

(10) TBNZ, TBZ $<imm>, <Rt>, <label>

(11) STLXR, STLXRW, STXR, STXRW, STLXRB, STLXRH, STXRB, STXRH <Rf>, (<Rn|RSP>), <Rs>

Examples:
	STLXR ZR, (R15), R16          <=>      stlxr w16, xzr, [x15]
	STXRB R9, (R21), R19          <=>      stxrb w19, w9, [x21]

(12) STLXP, STLXPW, STXP, STXPW (<Rf1>, <Rf2>), (<Rn|RSP>), <Rs>

Examples:
	STLXP (R17, R19), (R4), R5         <=>      stlxp w5, x17, x19, [x4]
	STXPW (R30, R25), (R22), R13       <=>      stxp w13, w30, w25, [x22]

2. Expressions for special arguments.

#<immediate> is written as $<immediate>.

Optionally-shifted immediate.

Examples:
	ADD $(3151<<12), R14, R20      <=>      add x20, x14, #0xc4f, lsl #12
	ADDW $1864, R25, R6            <=>      add w6, w25, #0x748
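
As an illustration of the two notations for an optionally-shifted immediate, the following sketch
formats the same value and shift in both syntaxes. The helper names goImm and gnuImm are
hypothetical. The hexadecimal form on the GNU side simply mirrors the examples above and is not a
requirement of either syntax.

```go
package main

import "fmt"

// goImm formats an immediate with an optional left shift in Go syntax,
// for example $1864 or $(3151<<12).
func goImm(v uint64, shift uint) string {
	if shift == 0 {
		return fmt.Sprintf("$%d", v)
	}
	return fmt.Sprintf("$(%d<<%d)", v, shift)
}

// gnuImm formats the same operand in GNU syntax, printing the value in
// hexadecimal as the examples above do.
func gnuImm(v uint64, shift uint) string {
	if shift == 0 {
		return fmt.Sprintf("#%#x", v)
	}
	return fmt.Sprintf("#%#x, lsl #%d", v, shift)
}

func main() {
	fmt.Printf("ADD %s, R14, R20 <=> add x20, x14, %s\n", goImm(3151, 12), gnuImm(3151, 12))
	fmt.Printf("ADDW %s, R25, R6 <=> add w6, w25, %s\n", goImm(1864, 0), gnuImm(1864, 0))
}
```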

Optionally-shifted registers are written as <Rm>{<shift><amount>}.
The <shift> can be <<(lsl), >>(lsr), ->(asr), @>(ror).

Examples:
	ADD R19>>30, R10, R24          <=>      add x24, x10, x19, lsr #30
	ADDW R26->24, R21, R15         <=>      add w15, w21, w26, asr #24
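
The four shift operators listed above correspond one-to-one to GNU shift names, which can be
sketched as a small table. The names shiftNames and gnuShiftedReg are hypothetical, and the caller
is assumed to have already translated the register name (for example R19 to x19).

```go
package main

import "fmt"

// shiftNames maps Go shift operators to GNU shift names, as listed above.
var shiftNames = map[string]string{
	"<<": "lsl",
	">>": "lsr",
	"->": "asr",
	"@>": "ror",
}

// gnuShiftedReg renders a shifted-register operand in GNU syntax. reg is the
// already-translated GNU register name, op is the Go shift operator.
func gnuShiftedReg(reg, op string, amount int) (string, error) {
	name, ok := shiftNames[op]
	if !ok {
		return "", fmt.Errorf("unknown shift operator %q", op)
	}
	return fmt.Sprintf("%s, %s #%d", reg, name, amount), nil
}

func main() {
	// Go operand R19>>30 from the first example above.
	s, err := gnuShiftedReg("x19", ">>", 30)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("add x24, x10, " + s) // prints "add x24, x10, x19, lsr #30"
}
```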

Extended registers are written as <Rm>{.<extend>{<<<amount>}}.
<extend> can be UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW or SXTX.

Examples:
	ADDS R19.UXTB<<4, R9, R26      <=>      adds x26, x9, w19, uxtb #4
	ADDSW R14.SXTX, R14, R6        <=>      adds w6, w14, w14, sxtx

Memory references: [<Xn|SP>{,#0}] is written as (Rn|RSP); a base register and an immediate
offset is written as imm(Rn|RSP); a base register and an offset register is written as (Rn|RSP)(Rm).

Examples:
	LDAR (R22), R9                     <=>      ldar x9, [x22]
	LDP 28(R17), (R15, R23)            <=>      ldp x15, x23, [x17,#28]
	MOVWU (R4)(R12<<2), R8             <=>      ldr w8, [x4, x12, lsl #2]
	MOVD (R7)(R11.UXTW<<3), R25        <=>      ldr x25, [x7,w11,uxtw #3]
	MOVBU (R27)(R23), R14              <=>      ldrb w14, [x27,x23]
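
The three memory reference forms described above can be sketched as a small formatter that emits
the Go spelling. The type memRef and its method goSyntax are hypothetical. The sketch assumes
that an operand has either an immediate offset or an offset register, never both.

```go
package main

import "fmt"

// memRef describes a memory operand using Go register names.
// Index is empty when there is no offset register. It may carry a shift or
// an extension, for example R12<<2 or R11.UXTW<<3.
type memRef struct {
	Base  string // for example R17 or RSP
	Imm   int64  // immediate offset, ignored when Index is set
	Index string // offset register, optionally shifted or extended
}

// goSyntax renders m in Go assembly syntax: (Rn), imm(Rn) or (Rn)(Rm).
func (m memRef) goSyntax() string {
	switch {
	case m.Index != "":
		return fmt.Sprintf("(%s)(%s)", m.Base, m.Index)
	case m.Imm != 0:
		return fmt.Sprintf("%d(%s)", m.Imm, m.Base)
	default:
		return fmt.Sprintf("(%s)", m.Base)
	}
}

func main() {
	refs := []memRef{
		{Base: "R22"},
		{Base: "R17", Imm: 28},
		{Base: "R4", Index: "R12<<2"},
	}
	for _, r := range refs {
		fmt.Println(r.goSyntax())
	}
}
```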

Register pairs are written as (Rt1, Rt2).

Examples:
	LDP.P -240(R11), (R12, R26)        <=>      ldp x12, x26, [x11],#-240

Register with arrangement and register with arrangement and index.

Examples:
	VADD V5.H8, V18.H8, V9.H8                     <=>      add v9.8h, v18.8h, v5.8h
	VLD1 (R2), [V21.B16]                          <=>      ld1 {v21.16b}, [x2]
	VST1.P V9.S[1], (R16)(R21)                    <=>      st1 {v9.s}[1], [x16], x21
	VST1.P [V13.H8, V14.H8, V15.H8], (R3)(R14)    <=>      st1 {v13.8h-v15.8h}, [x3], x14
	VST1.P [V14.D1, V15.D1], (R7)(R23)            <=>      st1 {v14.1d, v15.1d}, [x7], x23
*/
package arm64