// Copyright 2019 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*
Package ppc64 implements a PPC64 assembler that assembles Go asm into
the corresponding PPC64 instructions as defined by the Power ISA 3.0B.

This document provides information on how to write code in Go assembler
for PPC64, focusing on the differences between Go and PPC64 assembly language.
It assumes some knowledge of PPC64 assembler. The original implementation of
PPC64 in Go defined many opcodes that are different from PPC64 opcodes, but
later updates to the Go assembly language use mnemonics that are mostly similar if
not identical to the PPC64 mnemonics, such as the VMX and VSX instructions. Not all
detail is included here; refer to the Power ISA document for more detail.

Starting with Go 1.15 the Go objdump supports the -gnu option, which provides a
side-by-side view of the Go assembler and the PPC64 assembler output. This is
extremely helpful in determining what final PPC64 assembly is generated from the
corresponding Go assembly.

In the examples below, the Go assembly is on the left, PPC64 assembly on the right.

1. Operand ordering

In Go asm, the last operand (right) is the target operand, but with PPC64 asm,
the first operand (left) is the target. The order of the remaining operands is
not consistent: in general opcodes with 3 operands that perform math or logical
operations have their operands in reverse order. Opcodes for vector instructions
and those with more than 3 operands usually have operands in the same order except
for the target operand, which is first in PPC64 asm and last in Go asm.

Example:
ADD R3, R4, R5 <=> add r5, r4, r3

2. Constant operands

In Go asm, an operand that starts with '$' indicates a constant value. If the
instruction using the constant has an immediate version of the opcode, then an
immediate value is used with the opcode if possible.

Example:
ADD $1, R3, R4 <=> addi r4, r3, 1

3. Opcodes setting condition codes

In PPC64 asm, some instructions other than compares have variations that can set
the condition code where meaningful. This is indicated by adding '.' to the end
of the PPC64 instruction. In Go asm, these instructions have 'CC' at the end of
the opcode. The possible settings of the condition code depend on the instruction.
CR0 is the default for fixed-point instructions; CR1 for floating point; CR6 for
vector instructions.

Example:
ANDCC R3, R4, R5 <=> and. r5, r3, r4 (set CR0)

4. Loads and stores from memory

In Go asm, opcodes starting with 'MOV' indicate a load or store. When the target
is a memory reference, then it is a store; when the target is a register and the
source is a memory reference, then it is a load.

MOV{B,H,W,D} variations identify the size as byte, halfword, word, doubleword.

Adding 'Z' to the opcode for a load indicates zero extend; if omitted it is sign extend.
Adding 'U' to a load or store indicates an update of the base register with the offset.
Adding 'BR' to an opcode indicates byte-reversed load or store, or the order opposite
of the expected endian order. If 'BR' is used then zero extend is assumed.

Memory references n(Ra) indicate the address in Ra + n. When used with an update form
of an opcode, the value in Ra is incremented by n.

Memory references (Ra+Rb) or (Ra)(Rb) indicate the address Ra + Rb, used by indexed
loads or stores. Both forms are accepted. When used with an update then the base register
is updated by the value in the index register.

Examples:
MOVD (R3), R4 <=> ld r4,0(r3)
MOVW (R3), R4 <=> lwa r4,0(r3)
MOVWZU 4(R3), R4 <=> lwzu r4,4(r3)
MOVWZ (R3+R5), R4 <=> lwzx r4,r3,r5
MOVHZ (R3), R4 <=> lhz r4,0(r3)
MOVHU 2(R3), R4 <=> lhau r4,2(r3)
MOVBZ (R3), R4 <=> lbz r4,0(r3)

MOVD R4,(R3) <=> std r4,0(r3)
MOVW R4,(R3) <=> stw r4,0(r3)
MOVW R4,(R3+R5) <=> stwx r4,r3,r5
MOVWU R4,4(R3) <=> stwu r4,4(r3)
MOVH R4,2(R3) <=> sth r4,2(r3)
MOVBU R4,(R3)(R5) <=> stbux r4,r3,r5

5. Compares

When an instruction does a compare or other operation that might
result in a condition code, then the resulting condition is set
in a field of the condition register. The condition register consists
of 8 4-bit fields named CR0 - CR7. When a compare instruction
identifies a CR then the resulting condition is set in that field
to be read by a later branch or isel instruction. Within these fields,
bits are set to indicate less than, greater than, or equal conditions.

Once an instruction sets a condition, then a subsequent branch, isel or
other instruction can read the condition field and operate based on the
bit settings.

Examples:
CMP R3, R4 <=> cmp r3, r4 (CR0 assumed)
CMP R3, R4, CR1 <=> cmp cr1, r3, r4

Note that the condition register is the target operand of compare opcodes, so
the remaining operands are in the same order for Go asm and PPC64 asm.
When CR0 is used then it is implicit and does not need to be specified.

6. Branches

Many branches are represented as a form of the BC instruction. There are
other extended opcodes to make it easier to see what type of branch is being
used.

The following is a brief description of the BC instruction and its commonly
used operands.

BC op1, op2, op3

op1: type of branch
16 -> bctr (branch on ctr)
12 -> bcr (branch if cr bit is set)
8 -> bcr+bctr (branch on ctr and cr values)
4 -> bcr != 0 (branch if specified cr bit is not set)

There are more combinations but these are the most common.

op2: condition register field and condition bit

This contains an immediate value indicating which condition field
to read and what bits to test. Each field is 4 bits long with CR0
at bit 0, CR1 at bit 4, etc. The value is computed as 4*CR+condition
with these condition values:

0 -> LT
1 -> GT
2 -> EQ
3 -> OVG

Thus 0 means test CR0 for LT, 5 means CR1 for GT, 30 means CR7 for EQ.
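
The op2 computation above can be sketched in Go as follows. This is only an
illustration of the 4*CR+condition formula; the name bcCondOperand and the
cond* constants are hypothetical and are not part of the assembler.

```go
package main

import "fmt"

// Condition bit offsets within a 4-bit CR field, as listed above.
// These names are made up for this sketch.
const (
	condLT  = 0
	condGT  = 1
	condEQ  = 2
	condOVG = 3
)

// bcCondOperand returns the op2 immediate for BC, computed as 4*CR+condition.
// Hypothetical helper for illustration; not part of the Go assembler.
func bcCondOperand(cr, cond int) int {
	return 4*cr + cond
}

func main() {
	fmt.Println(bcCondOperand(0, condLT)) // CR0, LT
	fmt.Println(bcCondOperand(1, condGT)) // CR1, GT
	fmt.Println(bcCondOperand(7, condEQ)) // CR7, EQ
}
```

The three calls reproduce the values 0, 5 and 30 given in the text.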

op3: branch target

Examples:

BC 12, 0, target <=> blt cr0, target
BC 12, 2, target <=> beq cr0, target
BC 12, 5, target <=> bgt cr1, target
BC 12, 30, target <=> beq cr7, target
BC 4, 6, target <=> bne cr1, target
BC 4, 5, target <=> ble cr1, target

The following extended opcodes are available for ease of use and readability:

BNE CR2, target <=> bne cr2, target
BEQ CR4, target <=> beq cr4, target
BLT target <=> blt target (cr0 default)
BGE CR7, target <=> bge cr7, target

Refer to the ISA for more information on additional values for the BC instruction,
how to handle OVG information, and much more.

7. Align directive

Starting with Go 1.12, Go asm supports the PCALIGN directive, which indicates
that the next instruction should be aligned to the specified value. Currently
8 and 16 are the only supported values, and a maximum of 2 NOPs will be added
to align the code. That means in the case where the code is aligned to 4 but
PCALIGN $16 is at that location, the code will only be aligned to 8 to avoid
adding 3 NOPs.
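
The padding rule described above can be modeled with a short Go sketch. It is
not the assembler's actual code: the function name nopsForPCAlign is
hypothetical, and the sketch assumes 4-byte instructions and a pc that is
already a multiple of 4.

```go
package main

import "fmt"

// nopsForPCAlign models the rule described above: pad to the requested
// alignment (8 or 16) with 4-byte NOPs, but never add more than 2.
// Hypothetical helper; it assumes pc is a multiple of 4.
func nopsForPCAlign(pc, align uint64) int {
	nops := int(((align - pc%align) % align) / 4)
	if nops > 2 {
		// Only reached for align 16 when pc%16 == 4:
		// settle for 8-byte alignment instead of adding 3 NOPs.
		nops = int(((8 - pc%8) % 8) / 4)
	}
	return nops
}

func main() {
	for _, pc := range []uint64{0x100, 0x104, 0x108, 0x10c} {
		fmt.Printf("pc=%#x PCALIGN $16 adds %d NOP(s)\n", pc, nopsForPCAlign(pc, 16))
	}
}
```

In this model, PCALIGN $16 at an address with pc%16 == 4 adds a single NOP,
leaving the code aligned to 8 as the text describes.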

The purpose of this directive is to improve performance for cases like loops
where better alignment (8 or 16 instead of 4) might be helpful. This directive
exists in PPC64 assembler and is frequently used by PPC64 assembler writers.

PCALIGN $16
PCALIGN $8

Functions in Go are aligned to 16 bytes, as is the case in all other compilers
for PPC64.

8. Shift instructions

The simple scalar shifts on PPC64 expect a shift count that fits in 5 bits for
32-bit values or 6 bits for 64-bit values. If the shift count is a constant value
greater than the max then the assembler sets it to the max for that size (31 for
32-bit values, 63 for 64-bit values). If the shift count is in a register, then
only the low 5 or 6 bits of the register will be used as the shift count. The
Go compiler will add appropriate code to compare the shift value to achieve the
correct result, and the assembler does not add extra checking.

Examples:

SRAD $8,R3,R4 => sradi r4,r3,8
SRD $8,R3,R4 => rldicl r4,r3,56,8
SLD $8,R3,R4 => rldicr r4,r3,8,55
SRAW $16,R4,R5 => srawi r5,r4,16
SRW $40,R4,R5 => rlwinm r5,r4,0,0,31
SLW $12,R4,R5 => rlwinm r5,r4,12,0,19

Some non-simple shifts have operands in the Go assembly which don't map directly
onto operands in the PPC64 assembly. When an operand in a shift instruction in the
Go assembly is a bit mask, that mask is represented as a start and end bit in the
PPC64 assembly instead of a mask. See the ISA for more detail on these types of shifts.
Here are a few examples:

RLWMI $7,R3,$65535,R6 => rlwimi r6,r3,7,16,31
RLDMI $0,R4,$7,R6 => rldimi r6,r4,0,61
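
The mask-to-bit-position mapping in the two examples above can be reproduced
with the following Go sketch. The helper names maskBits32 and maskBegin64 are
hypothetical, and the sketch assumes a non-zero, contiguous mask that does not
wrap around; it is not how the assembler itself performs the conversion.

```go
package main

import (
	"fmt"
	"math/bits"
)

// maskBits32 returns the PPC64 start and end bit numbers of a non-zero,
// contiguous 32-bit mask. PPC64 numbers bits from the most significant
// end, so bit 0 is the MSB and bit 31 is the LSB.
// Hypothetical helper for illustration only.
func maskBits32(mask uint32) (start, end int) {
	return bits.LeadingZeros32(mask), 31 - bits.TrailingZeros32(mask)
}

// maskBegin64 returns the start bit of a non-zero, contiguous 64-bit mask,
// using the same MSB-first numbering (bit 0 is the MSB, bit 63 the LSB).
// Hypothetical helper for illustration only.
func maskBegin64(mask uint64) int {
	return bits.LeadingZeros64(mask)
}

func main() {
	start, end := maskBits32(65535)
	fmt.Println(start, end)     // operands 16,31 of the rlwimi example
	fmt.Println(maskBegin64(7)) // operand 61 of the rldimi example
}
```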

More recently, Go opcodes were added which map directly onto the PPC64 opcodes. It is
recommended to use the newer opcodes to avoid confusion.

RLDICL $0,R4,$15,R6 => rldicl r6,r4,0,15
RLDICR $0,R4,$15,R6 => rldicr r6,r4,0,15

Register naming

1. Special register usage in Go asm

The following registers should not be modified by user Go assembler code.

R0: Go code expects this register to contain the value 0.
R1: Stack pointer
R2: TOC pointer when compiled with -shared or -dynlink (a.k.a. position-independent code)
R13: TLS pointer
R30: g (goroutine)

Register names:

Rn is used for general purpose registers. (0-31)
Fn is used for floating point registers. (0-31)
Vn is used for vector registers. Slot 0 of Vn overlaps with Fn. (0-31)
VSn is used for vector-scalar registers. V0-V31 overlap with VS32-VS63. (0-63)
CTR represents the count register.
LR represents the link register.

*/
package ppc64