Anomie's Register Doc

Published in 
SNES
 · 21 Apr 2024

Anomie's Register Doc
Revision: 1157
Date: 2007-07-12 16:39:41 -0400 (Thu, 12 Jul 2007)
<anomie@users.sourceforge.net>

Contents

  1. Registers
  2. Sprites
    • 2.1 OAM
    • 2.2 Palettes
    • 2.3 Character table in VRAM
    • 2.4 Sprite Priority
    • 2.5 Drawing the Sprites

  3. Backgrounds
    • 3.1 BG Modes
    • 3.2 Tile Maps and Character Maps
    • 3.3 BG Scrolling
    • 3.4 Direct Color Mode
    • 3.5 Mode 0
    • 3.6 Mode 1
    • 3.7 Mode 2
    • 3.8 Mode 3
    • 3.9 Mode 4
    • 3.10 Mode 5
    • 3.11 Mode 6
    • 3.12 Mode 7
    • 3.13 Rendering the BGs
    • 3.14 Unresolved Issues

  4. Windows
    • 4.1 The Color Window

  5. Rendering the screen
    • 5.1 Mosaic
    • 5.2 Color Math
    • 5.3 Rendering the Screen

  6. Controllers
    • 6.1 Generic
    • 6.2 "Open Port"
    • 6.3 Joypads
    • 6.4 Mouse
    • 6.5 SuperScope
    • 6.6 Justifiers
    • 6.7 MP5

  7. DMA and HDMA
    • 7.1 DMA
    • 7.2 HDMA

  8. History

Registers

Addr rw?fvha Name 
bits

Explanation

"Addr" is the address this register is mapped into the SNES memory space.
"Name" is the official and unofficial name of the register
"bits" is either 8 or 16 characters explicating the bitfields in this register.

The flags are:
rw?fvha
||||||+--> '+' if it can be read/written at any time, '-' otherwise
|||||+---> '+' if it can be read/written during H-Blank
||||+----> '+' if it can be read/written during V-Blank
|||+-----> '+' if it can be read/written during force-blank
||+------> Read/Write style: 'b'     => byte
||                           'h'/'l' => read/write high/low byte of a word
||                           'w'     => word read/write twice low then high
|+-------> 'w' if the register is writable for an effect
+--------> 'r' if the register is readable for a value or effect (i.e. not
           open bus).

To find the entry for a particular register, search for the register number
(e.g. '2100') at the very beginning of the line. Note that the DMA registers
are combined, so e.g. to find $4300, $4310, $4320, $4330, $4340, $4350, $4360,
or $4370 you'd search for '43x0'.

For most registers (and most undefined bits of readable registers), the
returned value is Open Bus, that is the last value read over the main bus from
the ROM (typically part of the opcode arguments or the indirect base address).

Registers matching $21x4-6 or $21x8-A (where x is 0-2) return the last value
read from any of the PPU1 registers $2134-6, $2138-A, or $213E. This is known
as PPU1 Open Bus. Similarly, PPU2 Open Bus involves reading registers $213B-D
or $213F (NOT $21xB-D though).

Note that it may be possible to write registers at any time even if they are
marked '-', but until we have proof, '-' is the better guess.

--------

2100 wb++++ INIDISP - Screen Display
x---bbbb

x = Force blank on when set.
bbbb = Screen brightness, F=max, 0="off".

Note that force blank CAN be disabled mid-scanline. However, this can
result in glitched graphics on that scanline, as the internal rendering
buffers will not have been updated during force blank. Current theory
is that BGs will be glitched for a few tiles (depending on how far in
advance the PPU operates), and OBJ will be glitched for the entire
scanline.

Also, writing this register on the first line of V-Blank (225 or 240,
depending on overscan) when force blank is currently active causes the
OAM Address Reset to occur.


2101 wb++?- OBSEL - Object Size and Chr Address
sssnnbbb

sss = Object size:
000 = 8x8 and 16x16 sprites
001 = 8x8 and 32x32 sprites
010 = 8x8 and 64x64 sprites
011 = 16x16 and 32x32 sprites
100 = 16x16 and 64x64 sprites
101 = 32x32 and 64x64 sprites
110 = 16x32 and 32x64 sprites ('undocumented')
111 = 16x32 and 32x32 sprites ('undocumented')

nn = Name Select
bbb = Name Base Select (Addr>>14)
See the section "SPRITES" below for details.
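
As a rough illustration, the bitfields above could be unpacked like this in
C. The type and function names are made up for this sketch; only the
sssnnbbb layout and the size table come from the list above.

```c
#include <stdint.h>

/* Hypothetical helper types, not part of any real emulator API. */
typedef struct { uint8_t w, h; } ObjSize;

typedef struct {
    ObjSize first;        /* first size listed for this sss value  */
    ObjSize second;       /* second size listed for this sss value */
    uint8_t name_select;  /* nn  */
    uint8_t name_base;    /* bbb */
} ObselFields;

/* Split an OBSEL ($2101) value into its sss, nn and bbb fields. */
static ObselFields obsel_decode(uint8_t value)
{
    static const ObjSize first[8] = {
        {8, 8}, {8, 8}, {8, 8}, {16, 16},
        {16, 16}, {32, 32}, {16, 32}, {16, 32}
    };
    static const ObjSize second[8] = {
        {16, 16}, {32, 32}, {64, 64}, {32, 32},
        {64, 64}, {64, 64}, {32, 64}, {32, 32}
    };
    ObselFields f;
    unsigned sss = (value >> 5) & 7u;

    f.first = first[sss];
    f.second = second[sss];
    f.name_select = (uint8_t)((value >> 3) & 3u);
    f.name_base = (uint8_t)(value & 7u);
    return f;
}
```

For example, a value of $63 selects the 16x16 and 32x32 sizes with nn=0 and
bbb=3.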


2102 wl++?- OAMADDL - OAM Address low byte
2103 wh++?- OAMADDH - OAM Address high bit and Obj Priority
p------b aaaaaaaa

p = Obj Priority activation bit
When this bit is set, an Obj other than Sprite 0 may be given
priority. See the section "SPRITES" below for details.

b aaaaaaaa = OAM address
This can be thought of in two ways, depending on your conception of
OAM. If you consider OAM as a 544-byte table, baaaaaaaa is the word
address into that table. If you consider OAM to be a 512-byte table
and a 32-byte table, b is the table selector and aaaaaaaa is the
word address in the table. See the section "SPRITES" below for
details.

The internal OAM address is invalidated when scanlines are being
rendered. This invalidation is deterministic, but we do not know how
it is determined. Thus, the last value written to these registers is
reloaded into the internal OAM address at the beginning of V-Blank if
that occurs outside of a force-blank period. This is known as 'OAM
reset'. 'OAM reset' also occurs on certain writes to $2100.

Writing to either $2102 or $2103 resets the entire internal OAM Address
to the values last written to this register. E.g., if you set $104 to
this register, write 4 bytes, then write $1 to $2103, the internal OAM
address will point to word 4, not word 6.
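
The reload behavior described in the last paragraph might be modeled
roughly as follows. This is only a sketch: the struct and function names
are invented, the internal address is assumed to be kept in bytes, and the
rendering-time invalidation is not modeled at all.

```c
#include <stdint.h>

typedef struct {
    uint16_t reload;    /* 9-bit word address last written via $2102/3 */
    uint16_t internal;  /* internal OAM address, kept here in bytes    */
    uint8_t  priority;  /* p bit from $2103                            */
} OamAddr;

/* Copy the whole reload value into the internal address. */
static void oam_reload(OamAddr *a)
{
    a->internal = (uint16_t)(a->reload << 1);
}

/* Write to $2102: replaces the low 8 bits, then reloads everything. */
static void oamaddl_write(OamAddr *a, uint8_t value)
{
    a->reload = (uint16_t)((a->reload & 0x100u) | value);
    oam_reload(a);
}

/* Write to $2103: replaces bit 8 and the p bit, then reloads everything. */
static void oamaddh_write(OamAddr *a, uint8_t value)
{
    a->reload = (uint16_t)((a->reload & 0x0ffu) | ((value & 1u) << 8));
    a->priority = (uint8_t)(value >> 7);
    oam_reload(a);
}

/* One byte access through $2104 or $2138 (assumption: +1 byte each). */
static void oam_advance(OamAddr *a)
{
    a->internal = (uint16_t)(a->internal + 1u);
}
```

Replaying the example from the text (set $104, access 4 bytes, write $1 to
$2103) leaves the internal address at word $104 rather than word $106.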


2104 wb++-- OAMDATA - Data for OAM write
dddddddd

Note that OAM writes are done in an odd manner, in particular
the low table of OAM is not affected until the high byte of a
word is written (however, the high table is affected
immediately). Thus, if you set the address, then alternate writes and
reads, OAM will never be affected until you reach the high table!

Similarly, if you set the address to 0, then write 1, 2, read, then
write 3, OAM will end up as "01 02 01 03", rather than "01 02 xx 03" as
you might expect.
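
One way to picture the write behavior is a one-byte latch in front of the
low table. The sketch below is an assumption-laden model, not a description
of the actual hardware: names are invented, only the 512-byte low table is
modeled, and the address simply wraps within it.

```c
#include <stdint.h>

typedef struct {
    uint8_t  low[512];  /* low OAM table only, for brevity        */
    uint16_t addr;      /* internal byte address, 0..511 here     */
    uint8_t  latch;     /* last byte written to an even address   */
} OamModel;

/* Write to $2104. */
static void oam_write(OamModel *o, uint8_t value)
{
    if ((o->addr & 1u) == 0) {
        o->latch = value;                /* low byte: held, OAM untouched */
    } else {
        o->low[o->addr - 1] = o->latch;  /* high byte commits the word    */
        o->low[o->addr] = value;
    }
    o->addr = (uint16_t)((o->addr + 1u) & 0x1ffu);
}

/* Read from $2138: returns the current byte and advances the address. */
static uint8_t oam_read(OamModel *o)
{
    uint8_t value = o->low[o->addr];
    o->addr = (uint16_t)((o->addr + 1u) & 0x1ffu);
    return value;
}
```

With the address at 0, the sequence write 1, write 2, read, write 3 gives
"01 02 01 03" in this model, since the final write lands on an odd address
and commits the stale latch value along with it.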

Technically, this register CAN be written during H-blank (and probably
mid-scanline as well). However, due to OAM address invalidation the
actual OAM byte written will probably not be what you expect. Note that
writing during force-blank will only work as expected if that
force-blank was begun during V-Blank, or (probably) if $2102/3 have
been reset during that force-blank period.

See the section "SPRITES" below for details.


2105 wb+++- BGMODE - BG Mode and Character Size
DCBAemmm

A/B/C/D = BG character size for BG1/BG2/BG3/BG4
If the bit is set, then the BG is made of 16x16 tiles. Otherwise,
8x8 tiles are used. However, note that Modes 5 and 6 always use
16-pixel wide tiles, and Mode 7 always uses 8x8 tiles. See the
section "BACKGROUNDS" below for details.

mmm = BG Mode
e = Mode 1 BG3 priority bit
Mode     BG depth  OPT  Priorities
         1 2 3 4        Front -> Back
-=-------=-=-=-=----=---============---
 0       2 2 2 2    n    3AB2ab1CD0cd
 1       4 4 2      n    3AB2ab1C 0c
            * if e set: C3AB2ab1  0c
 2       4 4        y    3A 2B 1a 0b
 3       8 4        n    3A 2B 1a 0b
 4       8 2        y    3A 2B 1a 0b
 5       4 2        n    3A 2B 1a 0b
 6       4          y    3A 2  1a 0
 7       8          n    3  2  1a 0
 7+EXTBG 8 7        n    3  2B 1a 0b

"OPT" means "Offset-per-tile mode". For the priorities, numbers
mean sprites with that priority. Letters correspond to BGs (A=1,
B=2, etc), with upper/lower case indicating tile priority 1/0. See
the section "BACKGROUNDS" below for details.

Mode 7's EXTBG mode allows you to enable BG2, which uses the same
tilemap and character data as BG1 but interprets bit 7 of the pixel
data as a priority bit. BG2 also has some oddness to do with some
of the per-BG registers below. See the Mode 7 section under
BACKGROUNDS for details.


2106 wb+++- MOSAIC - Screen Pixelation
xxxxDCBA

A/B/C/D = Affect BG1/BG2/BG3/BG4

xxxx = pixel size, 0=1x1, F=16x16
The mosaic filter goes over the BG and covers each x-by-x square
with the upper-left pixel of that square, with the top of the first
row of squares on the 'starting scanline'. If this register is set
during the frame, the 'starting scanline' is the current scanline,
otherwise it is the first visible scanline of the frame. I.e. if
even scanlines are completely red and odd scanlines are completely
blue, setting the xxxx=1 mid-frame will make the rest of the screen
either completely red or completely blue depending on whether you
set xxxx on an even or an odd scanline.

XXX: It seems that writing the same value to this register does not
reset the 'starting scanline', but which changes do reset it?

Note that mosaic is applied after scrolling, but before any clip
windows, color windows, or math. So the XxX block can be partially
clipped, and it can be mathed as normal with a non-mosaiced BG. But
scrolling can't make it partially one color and partially another.

Modes 5-6 should 'double' the expansion factor to expand
half-pixels. This actually makes xxxx=0 have a visible effect,
since the even half-pixels (usually on the subscreen) hide the odd
half-pixels. The same thing happens vertically with interlace mode.

Mode 7, of course, is weird. BG1 mosaics about like normal, as long
as you remember that the Mode 7 transformations have no effect on
the XxX blocks. BG2 uses bit A to control 'vertical mosaic' and bit
B to control 'horizontal mosaic', so you could be expanding over
1xX, Xx1, or XxX blocks. This can get really interesting as BG1
still uses bit A as normal, so you could have the BG1 pixels
expanded XxX with high-priority BG2 pixels expanded 1xX on top of
them.

See the section "BACKGROUNDS" below for details.


2107 wb++?- BG1SC - BG1 Tilemap Address and Size
2108 wb++?- BG2SC - BG2 Tilemap Address and Size
2109 wb++?- BG3SC - BG3 Tilemap Address and Size
210a wb++?- BG4SC - BG4 Tilemap Address and Size
aaaaaayx

aaaaaa = Tilemap address in VRAM (Addr>>10)
x = Tilemap horizontal mirroring
y = Tilemap vertical mirroring
All tilemaps are 32x32 tiles. If x and y are both unset, there is
one tilemap at Addr. If x is set, a second tilemap follows the
first that should be considered "to the right of" the first. If y
is set, a second tilemap follows the first that should be
considered "below" the first. If both are set, then a second
follows "to the right", then a third "below", and a fourth "below
and to the right".
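
The layout rule can be sketched as an address calculation. The function
name is hypothetical, and the sketch assumes several things not spelled out
here: that Addr is a VRAM word address, that each tilemap entry is one
word, that entries within a 32x32 tilemap are stored row by row, and that
tile coordinates wrap at the edge of the combined map.

```c
#include <stdint.h>

/* Word address of the tilemap entry for tile (tx, ty), given a BGnSC value. */
static uint16_t bg_tilemap_entry_addr(uint8_t bgsc, unsigned tx, unsigned ty)
{
    unsigned base = (unsigned)(bgsc >> 2) << 10;  /* aaaaaa, Addr>>10 */
    unsigned wide = bgsc & 1u;                    /* x bit */
    unsigned tall = (bgsc >> 1) & 1u;             /* y bit */
    unsigned screen = 0;                          /* which 32x32 tilemap */

    tx &= wide ? 63u : 31u;  /* assumed wrap at the map edge */
    ty &= tall ? 63u : 31u;

    if (tx >= 32)
        screen += 1;              /* "to the right" always comes next */
    if (ty >= 32)
        screen += wide ? 2 : 1;   /* "below" is third only if x is also set */

    return (uint16_t)(base + screen * 0x400u + (ty & 31u) * 32u + (tx & 31u));
}
```

With both bits set and aaaaaa=1, tiles (0,0), (32,0), (0,32) and (32,32)
come out at word addresses $0400, $0800, $0C00 and $1000 respectively.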

See the section "BACKGROUNDS" below for more details.


210b wb++?- BG12NBA - BG1 and 2 Chr Address
210c wb++?- BG34NBA - BG3 and 4 Chr Address
bbbbaaaa

aaaa = Base address for BG1/3 (Addr>>13)
bbbb = Base address for BG2/4 (Addr>>13)
See the section "BACKGROUNDS" below for details.


210d ww+++- BG1HOFS - BG1 Horizontal Scroll
ww+++- M7HOFS - Mode 7 BG Horizontal Scroll
210e ww+++- BG1VOFS - BG1 Vertical Scroll
ww+++- M7VOFS - Mode 7 BG Vertical Scroll
------xx xxxxxxxx
---mmmmm mmmmmmmm

x = The BG offset, 10 bits.
m = The Mode 7 BG offset, 13 bits two's-complement signed.

These are actually two registers in one (or would that be "4 registers
in 2"?). Anyway, writing $210d will write both BG1HOFS which works
exactly like the rest of the BGnxOFS registers below ($210f-$2114), and
M7HOFS which works with the M7* registers ($211b-$2120) instead.

Modes 0-6 use BG1xOFS and ignore M7xOFS, while Mode 7 uses M7xOFS and
ignores BG1xOFS. See the appropriate sections below for details, and
note the different formulas for BG1HOFS versus M7HOFS.


210f ww+++- BG2HOFS - BG2 Horizontal Scroll
2110 ww+++- BG2VOFS - BG2 Vertical Scroll
2111 ww+++- BG3HOFS - BG3 Horizontal Scroll
2112 ww+++- BG3VOFS - BG3 Vertical Scroll
2113 ww+++- BG4HOFS - BG4 Horizontal Scroll
2114 ww+++- BG4VOFS - BG4 Vertical Scroll
------xx xxxxxxxx

Note that these are "write twice" registers, first the low byte is
written then the high. Current theory is that writes to the register
work like this:
BGnHOFS = (Current<<8) | (Prev&~7) | ((Reg>>8)&7);
Prev = Current;
or
BGnVOFS = (Current<<8) | Prev;
Prev = Current;

Note that there is only one Prev shared by all the BGnxOFS registers.
This is NOT shared with the M7* registers (not even M7xOFS and
BG1xOFS).
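
Spelled out as compilable C, the current theory above might look like the
following. The struct and function names are invented for this sketch; the
two update formulas and the single shared Prev are taken directly from the
text, and the stored value is left unmasked just as in those formulas.

```c
#include <stdint.h>

typedef struct {
    uint16_t hofs[4];  /* BG1..BG4 horizontal scroll registers */
    uint16_t vofs[4];  /* BG1..BG4 vertical scroll registers   */
    uint8_t  prev;     /* the one Prev shared by all BGnxOFS   */
} BgScroll;

/* Byte write to BGnHOFS; bg is 0..3 for BG1..BG4. */
static void bg_write_hofs(BgScroll *s, int bg, uint8_t current)
{
    s->hofs[bg] = (uint16_t)((current << 8) | (s->prev & ~7)
                             | ((s->hofs[bg] >> 8) & 7));
    s->prev = current;
}

/* Byte write to BGnVOFS; bg is 0..3 for BG1..BG4. */
static void bg_write_vofs(BgScroll *s, int bg, uint8_t current)
{
    s->vofs[bg] = (uint16_t)((current << 8) | s->prev);
    s->prev = current;
}
```

Writing $34 then $12 to the same register yields $1234 in either case, but
interleaving writes to different registers mixes bytes through Prev.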

x = The BG offset, at most 10 bits (some modes effectively use as few
as 8).

Note that all BGs wrap if you try to go past their edges. Thus, the
maximum offset value in BG Modes 0-6 is 1023, since you have at most 64
tiles (if x/y of BGnSC is set) of 16 pixels each (if the appropriate
bit of BGMODE is set).

Horizontal scrolling scrolls in units of full pixels no matter if we're
rendering a 256-pixel wide screen or a 512-half-pixel wide screen.
However, vertical scrolling will move in half-line increments if
interlace mode is active.

See the section "BACKGROUNDS" below for details.


2115 wb++?- VMAIN - Video Port Control
i---mmii

i = Address increment mode:
0 => increment after writing $2118/reading $2139
1 => increment after writing $2119/reading $213a
Note that a word write stores low first, then high. Thus, if you're
storing a word value to $2118/9, you'll probably want to set 1
here.

ii = Address increment amount
00 = Normal increment by 1
01 = Increment by 32
10 = Increment by 128
11 = Increment by 128

mm = Address remapping
00 = No remapping
01 = Remap addressing aaaaaaaaBBBccccc => aaaaaaaacccccBBB
10 = Remap addressing aaaaaaaBBBcccccc => aaaaaaaccccccBBB
11 = Remap addressing aaaaaaBBBccccccc => aaaaaacccccccBBB

The "remap" modes basically implement address translation. If
$2116/7 are set to #$0003, then word address #$0018 will be written
instead, and $2116/7 will be incremented to $0004.
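
The remapping and increment tables translate fairly directly into C. This
is a sketch with made-up function names; the bit patterns and step sizes
are the ones listed above.

```c
#include <stdint.h>

/* Apply the VMAIN "mm" address remapping to a VRAM word address. */
static uint16_t vram_remap(uint8_t vmain, uint16_t addr)
{
    switch ((vmain >> 2) & 3) {
    case 1:  /* aaaaaaaaBBBccccc => aaaaaaaacccccBBB */
        return (uint16_t)((addr & 0xff00) | ((addr & 0x001f) << 3)
                          | ((addr >> 5) & 7));
    case 2:  /* aaaaaaaBBBcccccc => aaaaaaaccccccBBB */
        return (uint16_t)((addr & 0xfe00) | ((addr & 0x003f) << 3)
                          | ((addr >> 6) & 7));
    case 3:  /* aaaaaaBBBccccccc => aaaaaacccccccBBB */
        return (uint16_t)((addr & 0xfc00) | ((addr & 0x007f) << 3)
                          | ((addr >> 7) & 7));
    default: /* 00 = no remapping */
        return addr;
    }
}

/* Address increment amount selected by the low "ii" bits of VMAIN. */
static uint16_t vram_increment(uint8_t vmain)
{
    static const uint16_t step[4] = {1, 32, 128, 128};
    return step[vmain & 3];
}
```

With mm=01, vram_remap turns $0003 into $0018, matching the example in the
text.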


2116 wl++?- VMADDL - VRAM Address low byte
2117 wh++?- VMADDH - VRAM Address high byte
aaaaaaaa aaaaaaaa

This sets the address for $2118/9 and $2139/a. Note that this is a word
address, not a byte address!

See the sections "BACKGROUNDS" and "SPRITES" below for details.


2118 wl++-- VMDATAL - VRAM Data Write low byte
2119 wh++-- VMDATAH - VRAM Data Write high byte
xxxxxxxx xxxxxxxx

This writes data to VRAM. The writes take effect immediately(?), even
if no increment is performed. The address is incremented when one of
the two bytes is written; which one depends on the setting of bit 7 of
register $2115. Keep in mind the address translation bits of $2115 as
well.

The interaction between these registers and $2139/a is unknown.

See the sections "BACKGROUNDS" and "SPRITES" below for details.


211a wb++?- M7SEL - Mode 7 Settings
rc----yx

r = Playing field size: When clear, the playing field is 1024x1024
pixels (so the tilemap completely fills it). When set, the playing
field is much larger, and the 'empty space' fill is controlled by
bit 6.

c = Empty space fill, when bit 7 is set:
0 = Transparent.
1 = Fill with character 0. Note that the fill is matrix
transformed like all other Mode 7 tiles.

x/y = Horizontal/Vertical mirroring. If the bit is set, flip the
256x256 pixel 'screen' in that direction.

See the section "BACKGROUNDS" below for details.


211b ww+++- M7A - Mode 7 Matrix A (also used with $2134/6)
211c ww+++- M7B - Mode 7 Matrix B (also used with $2134/6)
211d ww+++- M7C - Mode 7 Matrix C
211e ww+++- M7D - Mode 7 Matrix D
aaaaaaaa aaaaaaaa

Note that these are "write twice" registers, first the low byte is
written then the high. Current theory is that writes to the register
work like this:
Reg = (Current<<8) | Prev;
Prev = Current;

Note that there is only one Prev shared by all these registers. This
Prev is NOT shared with the BGnxOFS registers, but it IS shared with
the M7xOFS registers.

These set the matrix parameters for Mode 7. The values are 16-bit signed
fixed point with 8 bits of fraction, i.e. the value should be divided by
256.0 when used in calculations. See below for more explanation.

The product A*(B>>8) may be read from registers $2134/6. There is
supposedly no important delay. It may not be operative during Mode 7
rendering.

See the section "BACKGROUNDS" below for details.


211f ww+++- M7X - Mode 7 Center X
2120 ww+++- M7Y - Mode 7 Center Y
---xxxxx xxxxxxxx

Note that these are "write twice" registers, like the other M7*
registers. See above for the write semantics. The value is 13 bit
two's-complement signed.

The matrix transformation formula is:

[ X ]   [ A B ]   [ SX + M7HOFS - CX ]   [ CX ]
[   ] = [     ] * [                  ] + [    ]
[ Y ]   [ C D ]   [ SY + M7VOFS - CY ]   [ CY ]

Note: SX/SY are screen coordinates. X/Y are coordinates in the playing
field from which the pixel is taken. If $211a bit 7 is clear, the
result is then restricted to 0<=X<=1023 and 0<=Y<=1023. If $211a bits 6
and 7 are both set and X or Y is less than 0 or greater than 1023, use
the low 3 bits of each to choose the pixel from character 0.

The bit-accurate formula seems to be something along the lines of:
#define CLIP(a) (((a)&0x2000)?((a)|~0x3ff):((a)&0x3ff))

X[0,y] = ((A*CLIP(HOFS-CX))&~63)
+ ((B*y)&~63) + ((B*CLIP(VOFS-CY))&~63)
+ (CX<<8)
Y[0,y] = ((C*CLIP(HOFS-CX))&~63)
+ ((D*y)&~63) + ((D*CLIP(VOFS-CY))&~63)
+ (CY<<8)

X[x,y] = X[x-1,y] + A
Y[x,y] = Y[x-1,y] + C

(In all cases, X[] and Y[] are fixed point with 8 bits of fraction)

See the section "BACKGROUNDS" below for details.


2121 wb+++- CGADD - CGRAM Address
cccccccc

This sets the word address (i.e. color) which will be affected by $2122
and $213b.


2122 ww+++- CGDATA - CGRAM Data write
-bbbbbgg gggrrrrr

This writes to CGRAM, effectively setting the palette colors.

Accesses to CGRAM are handled just like accesses to the low table of
OAM, see $2104 for details.

Note that the color values are stored in BGR order.
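
For illustration, packing 5-bit color components into the -bbbbbgg gggrrrrr
layout could be done like this (the function name is made up for this
sketch):

```c
#include <stdint.h>

/* Pack 5-bit red, green and blue components into a 15-bit CGRAM word. */
static uint16_t cgram_pack(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((b & 0x1f) << 10) | ((g & 0x1f) << 5) | (r & 0x1f));
}
```

Since this is a "write twice" register, the low byte of the packed word
would be written to $2122 first, then the high byte.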


2123 wb+++- W12SEL - Window Mask Settings for BG1 and BG2
2124 wb+++- W34SEL - Window Mask Settings for BG3 and BG4
2125 wb+++- WOBJSEL - Window Mask Settings for OBJ and Color Window
ABCDabcd

c = Enable window 1 for BG1/BG3/OBJ
a = Enable window 2 for BG1/BG3/OBJ
C/A = Enable window 1/2 for BG2/BG4/Color
When the bit is set, the corresponding window will affect the
corresponding background (subject to the settings of $212e/f).

d = Window 1 Inversion for BG1/BG3/OBJ
b = Window 2 Inversion for BG1/BG3/OBJ
D/B = Window 1/2 Inversion for BG2/BG4/Color
When the bit is set, "W" should be replaced by "~W" (not-W) in the
window combination formulae below.

See the section "WINDOWS" below for more details.


2126 wb+++- WH0 - Window 1 Left Position
2127 wb+++- WH1 - Window 1 Right Position
2128 wb+++- WH2 - Window 2 Left Position
2129 wb+++- WH3 - Window 2 Right Position
xxxxxxxx

These set the offset of the appropriate edge of the appropriate window.
Note that if the left edge is greater than the right edge, the window
is considered to have no range at all (and thus "W" always is false).
See the section "WINDOWS" below for more details.


212a wb+++- WBGLOG - Window mask logic for BGs
44332211
212b wb+++- WOBJLOG - Window mask logic for OBJs and Color Window
----ccoo

11/22/33/44/oo/cc = Mask logic for BG1/BG2/BG3/BG4/OBJ/Color
This specifies the window combination method, using standard
boolean operators:
00 = OR
01 = AND
10 = XOR
11 = XNOR

Consider two variables, W1 and W2, which are true for pixels
between the appropriate left and right bounds as set in
$2126-$2129 and false otherwise. Then, you have the following
possibilities: (replace "W#" with "~W#", depending on the Inversion
settings of $2123-$2125)
Neither window enabled => nothing masked.
One window enabled => Either W1 or W2, as appropriate.
Both windows enabled => W1 op W2, where "op" is as above.
Where the function is true, the BG will be masked.
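
The combination rules can be written out as a small C sketch. The names
are hypothetical, the window bounds are assumed to be inclusive on both
ends, and the $212e/f designation bits are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t left, right;  /* from $2126-$2129                      */
    bool enabled;         /* enable bit from $2123-$2125           */
    bool inverted;        /* inversion bit from $2123-$2125        */
} Window;

/* W (or ~W if inverted) for one window at pixel column x. */
static bool window_value(const Window *w, uint8_t x)
{
    /* When left > right this is never true, i.e. the window has no range. */
    bool inside = (x >= w->left) && (x <= w->right);
    return w->inverted ? !inside : inside;
}

/* True if the layer is masked at column x; logic is the 2-bit op field. */
static bool layer_masked(const Window *w1, const Window *w2,
                         uint8_t logic, uint8_t x)
{
    bool a, b;

    if (!w1->enabled && !w2->enabled)
        return false;                 /* nothing masked */
    if (!w2->enabled)
        return window_value(w1, x);
    if (!w1->enabled)
        return window_value(w2, x);

    a = window_value(w1, x);
    b = window_value(w2, x);
    switch (logic & 3) {
    case 0:  return a || b;   /* OR   */
    case 1:  return a && b;   /* AND  */
    case 2:  return a != b;   /* XOR  */
    default: return a == b;   /* XNOR */
    }
}
```

Note that an inverted window with left > right comes out true everywhere
in this model, since ~W of an empty W covers the whole line.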

See the section "WINDOWS" below for more details.


212c wb+++- TM - Main Screen Designation
212d wb+++- TS - Subscreen Designation
---o4321

1/2/3/4/o = Enable BG1/BG2/BG3/BG4/OBJ for display
on the main (or sub) screen.

See the section "BACKGROUNDS" below for details.


212e wb+++- TMW - Window Mask Designation for the Main Screen
212f wb+++- TSW - Window Mask Designation for the Subscreen
---o4321

1/2/3/4/o = Enable window masking for BG1/BG2/BG3/BG4/OBJ on the
main (or sub) screen.

See the section "BACKGROUNDS" below for details.


2130 wb+++- CGWSEL - Color Addition Select
ccmm--sd

cc = Clip colors to black before math
00 => Never
01 => Outside Color Window only
10 => Inside Color Window only
11 => Always

mm = Prevent color math
00 => Never
01 => Outside Color Window only
10 => Inside Color Window only
11 => Always

s = Add subscreen (instead of fixed color)

d = Direct color mode for 256-color BGs

See the sections "BACKGROUNDS", "WINDOWS", and "RENDERING THE
SCREEN" below for details.


2131 wb+++- CGADSUB - Color math designation
shbo4321

s = Add/subtract select
0 => Add the colors
1 => Subtract the colors

h = Half color math. When set, the result of the color math is
divided by 2 (except when $2130 bit 1 is set and the fixed color is
used, or when color is clipped).

1/2/3/4/o/b = Enable color math on BG1/BG2/BG3/BG4/OBJ/Backdrop

See the sections "BACKGROUNDS", "WINDOWS", and "RENDERING THE
SCREEN" below for details.


2132 wb+++- COLDATA - Fixed Color Data
bgrccccc

b/g/r = Which color plane(s) to set the intensity for.
ccccc = Color intensity.

So basically, to set an orange you'd do something along the lines of:
LDA #$3f
STA $2132
LDA #$4f
STA $2132
LDA #$80
STA $2132
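
The effect of those three writes can be traced with a small C model. The
struct and function names are invented for this sketch; the bgrccccc
layout is the one given above.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } FixedColor;

/* Apply one write to COLDATA ($2132) to the fixed color. */
static void coldata_write(FixedColor *c, uint8_t value)
{
    uint8_t intensity = (uint8_t)(value & 0x1f);  /* ccccc */

    if (value & 0x20) c->r = intensity;  /* r plane selected */
    if (value & 0x40) c->g = intensity;  /* g plane selected */
    if (value & 0x80) c->b = intensity;  /* b plane selected */
}
```

Feeding it $3f, $4f and $80 in turn leaves r=31, g=15, b=0. A single write
with several plane bits set (e.g. $e0) sets all selected planes to the same
intensity.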

See the sections "BACKGROUNDS" and "WINDOWS" below for details.


2133 wb+++- SETINI - Screen Mode/Video Select
se--poIi

s = "External Sync". Used for superimposing "sfx" graphics, whatever
that means. Usually 0. Not much is known about this bit.
Interestingly, the S-PPU1 chip has a pin named "EXTSYNC" (or
not-EXTSYNC, since it has a bar over it) which is tied to Vcc.

e = Mode 7 EXTBG ("Extra BG"). When this bit is set, you may enable
BG2 on Mode 7. BG2 uses the same tile and character data as BG1,
but interprets the high bit of the color data as a priority for the
pixel.

Various sources report additional effects for this bit, possibly
related to bit 7. For example, "Enable the Data Supplied From the
External Lsi.", whatever that means. Of course, maybe that's a
typo and it's supposed to apply to bit 7 instead.

p = Enable pseudo-hires mode. This creates a 512-pixel horizontal
resolution by taking pixels from the subscreen for the
even-numbered pixels (zero based) and from the main screen for the
odd-numbered pixels. Color math behaves just as with Mode 5/6
hires. The interlace bit still has no effect. Mosaic operates as
normal (not like Mode 5/6). The 'subscreen' pixel is clipped (by
windows) when the main-screen pixel to the LEFT is clipped, not
when the one to the RIGHT is clipped as you'd expect. What happens
with pixel column 0 is unknown.

Enabling this bit in Modes 5 or 6 has no effect.

o = Overscan mode. When set, 239 lines will be displayed instead of
the normal 224. This also means V-Blank will occur that
much later, and be shorter. All that happens is that extra lines
get added to the display, and it seems the TV will like to move
the display up 8 pixels. See below for more details.

I = OBJ Interlace. When set, regardless of BG mode, the OBJ will be
interlaced (see bit 0 below), and thus will appear half-height.

Note that this only controls whether obj are drawn as normal or
not; the interlace signal is only output to the TV based on bit 0
below.

i = Screen interlace. When set in BG mode 5 (and probably 6), the
effective screen height will be 448 (or 478) pixels, rather than
224 (or 239). When set in any other mode, the screen will just get
a bit jumpy. However, toggling the tilemap each field would
simulate the increased screen height (much like pseudo-hires
simulates hires).

In hardware, setting this bit makes the SNES output a normal
interlace signal rather than always forcing one frame.

See the sections "BACKGROUNDS" and "SPRITES" below for details.

Overscan: The bit only matters at the very end of the frame. If you
change the setting on line 0xE0, before the normal NMI trigger point,
it is the same as if you had used that setting for the whole frame. Note
that this affects both the NMI trigger point and when HDMA stops for
the frame.

If you turn the bit off at the very beginning of scanline X (for
0xE1<=X<=0xF0), NMI will occur on line X and the last HDMA transfer
will occur on line X-1. However, on my TV at least, the display will
remain in the normal no-overscan position for lines E1-EC, it will
move up only one pixel for line ED, and it will lose vertical sync
for lines EF-F4!

Turning the bit on, only line E1 gives any effect: NMI will occur on
line E2, although the last HDMA will still occur on line E0.
Anything else acts like you left the bit off the whole time. Note,
however, that if you wait too long after the beginning of the
scanline then you will get no effect.

Even if there is no visible effect, the overscan setting still
affects VRAM writes. In particular, executing "LDA #'-' / STA $2118
/ LDA r2133 / STA $2133 / LDA #'+' / STA $2118" during the E1-F0
period will write only + or only - to VRAM, depending on whether the
overscan bit was set to 0 or 1.


2134 r l+++? MPYL - Multiplication Result low byte
2135 r m+++? MPYM - Multiplication Result middle byte
2136 r h+++? MPYH - Multiplication Result high byte
xxxxxxxx xxxxxxxx xxxxxxxx

This is the two's complement product of the 16-bit value written to $211b
and the 8-bit value most recently written to $211c. There is supposedly
no important delay. It may not be operative during Mode 7 rendering.
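
As a sketch of the arithmetic only (the function name is made up, and
timing and the Mode 7 caveat are ignored), the 24-bit result could be
computed like this:

```c
#include <stdint.h>

/* Signed 16-bit x signed 8-bit product, truncated to 24 bits. */
static uint32_t ppu1_multiply(uint16_t m7a, uint8_t m7b_byte)
{
    /* Sign-extend by hand to avoid implementation-defined conversions. */
    int32_t a = (m7a & 0x8000u) ? (int32_t)m7a - 0x10000 : (int32_t)m7a;
    int32_t b = (m7b_byte & 0x80u) ? (int32_t)m7b_byte - 0x100
                                   : (int32_t)m7b_byte;

    return (uint32_t)(a * b) & 0xffffffu;
}
```

MPYL, MPYM and MPYH would then be bits 0-7, 8-15 and 16-23 of the returned
value.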


2137 b++++ SLHV - Software Latch for H/V Counter
--------

When read, the H/V counter (as read from $213c and $213d) will be
latched to the current X and Y position if bit 7 of $4201 is set. The
data actually read is open bus.


2138 r w++?- OAMDATAREAD* - Data for OAM read
xxxxxxxx

OAM reads are straightforward: the current byte as set in $2102/3 and
incremented by reads from this register and writes to $2104 will be
returned. Note that writes to the low table are not as
straightforward. See register $2104 and the section "SPRITES" below for
details.

Also, note that OAM address invalidation probably affects the address
read by this register as well.


2139 r l++?- VMDATALREAD* - VRAM Data Read low byte
213a r h++?- VMDATAHREAD* - VRAM Data Read high byte
xxxxxxxx xxxxxxxx

Simply, this reads data from VRAM. The address is incremented when
either $2139 or $213a is read, depending on the setting of bit 7 of
$2115.

Actually, the reading is more complex. When either of these registers
is read, the appropriate byte from a word-sized buffer is returned. A
word from VRAM is loaded into this buffer just *before* the VRAM
address is incremented. The actual data read and the amount of the
increment depend on the low 4 bits of $2115. The effect of this is
that a 'dummy read' is required after setting $2116-7 before you start
getting the actual data.

The interaction between these registers and $2118/9 is unknown.

See the sections "BACKGROUNDS" and "SPRITES" below for details.


213b r w++?- CGDATAREAD* - CGRAM Data read
-bbbbbgg gggrrrrr

This reads from CGRAM.

Accesses to CGRAM are handled just like accesses to the low table of
OAM, see $2138 for details.

Note that the color values are stored in BGR order. The '-' bit is PPU2
Open Bus.


213c r w++++ OPHCT - Horizontal Scanline Location
213d r w++++ OPVCT - Vertical Scanline Location
-------x xxxxxxxx

These values are latched by reading $2137 when bit 7 of $4201 is set,
or by clearing-and-setting bit 7 of $4201 either by writing $4201 or by
pin 6 of Controller Port 2 (the latch occurs on the 1->0 transition).

Note that the value read is only 9 bits: bits 1-7 of the high byte are
PPU2 Open Bus. Each register keeps separate track of whether to
return the low or high byte. The high/low selector is reset to 'low'
when $213f is read (the selector is NOT reset when the counter is
latched).

H Counter values range from 0 to 339, with 22-277 being visible on the
screen. V Counter values range from 0 to 261 in NTSC mode (262 is
possible every other frame when interlace is active) and 0 to 311 in
PAL mode (312 in interlace?), with 1-224 (or 1-239(?) if overscan is
enabled) visible on the screen.


213e r b++++ STAT77 - PPU Status Flag and Version
trm-vvvv

t = Time Over Flag. If more than 34 sprite-tiles (e.g. a 16x16
sprite has 2 sprite-tiles) were encountered on a single line, this
flag will be set. The flag is reset at the end of V-Blank. See the
section "SPRITES" below for details.

r = Range Over Flag. If more than 32 sprites were encountered on a
single line, this flag will be set. The flag is reset at the end of
V-Blank. See the section "SPRITES" below for details.

Note that the above two flags are set whether or not OBJ are
actually enabled at the time.

m = "Master/slave mode select". Little is known about this bit.
Current theory is that it indicates the status of the "MASTER" pin
on the S-PPU1 chip, which in the normal SNES is always Gnd. We
always seem to read back 0 here.

vvvv = 5C77 chip version number. So far, we've only encountered version
1.

The '-' bit is PPU1 Open Bus.


213f r b++++ STAT78 - PPU Status Flag and Version
fl-pvvvv

f = Interlace Field. This will toggle every V-Blank.

l = External latch flag. When the PPU counters are latched, this
flag gets set. The flag is reset on read, but only when $4201 bit 7
is set.

p = NTSC/PAL Mode. If this is a PAL SNES, this bit will be set,
otherwise it will be clear.

vvvv = 5C78 chip version number. So far, we've encountered at least 2
and 3. Possibly 1 as well.

The '-' bit is PPU2 Open Bus.

Note: as a side effect of reading this register, the high/low byte
selector for $213c/d is reset to 'low'.


2140 rwb++++ APUIO0 - APU I/O register 0
2141 rwb++++ APUIO1 - APU I/O register 1
2142 rwb++++ APUIO2 - APU I/O register 2
2143 rwb++++ APUIO3 - APU I/O register 3
xxxxxxxx

These registers are used in communication with the SPC700. Note that
the value written here is not the value read back. Rather, the value
written shows up in the SPC700's registers $f4-7, and the values
written to those registers by the SPC700 are what you read here.

If the SPC700 writes the register during a read, the value read will
be the logical OR of the old and new values. The exact cycles during
which the 'read' actually occurs is not known, although a good guess
would be some portion of the final 3 master cycles of the 6-cycle
memory access.

Note that these registers are mirrored throughout the range
$2140-$217f.


2180 rwb++++ WMDATA - WRAM Data read/write
xxxxxxxx

This register reads from or writes to the WRAM address set in $2181-3.
The address is then incremented. The effect of mixed reads and writes
is unknown, but it is suspected that they are handled logically.

Note that attempting a DMA from WRAM to this register will not work:
WRAM will not be written. Attempting a DMA from this register to WRAM
will similarly not work: the value written is (initially) the Open Bus
value. In either case, the address in $2181-3 is not incremented.


2181 wl++++ WMADDL - WRAM Address low byte
2182 wm++++ WMADDM - WRAM Address middle byte
2183 wh++++ WMADDH - WRAM Address high bit
-------x xxxxxxxx xxxxxxxx

This is the address that will be read or written by accesses to $2180.
Note that WRAM is also mapped in the SNES memory space from $7E:0000 to
$7F:FFFF, and from $0000 to $1FFF in banks $00 through $3F and $80
through $BF.

Various docs indicate that these registers may be read as well as
written. However, they are wrong. These registers are open bus.

DMA from WRAM to these registers has no effect. Otherwise, however, DMA
writes them as normal. This means you could use DMA mode 4 to $2180 and
a table in ROM to write any sequence of RAM addresses.

The value does not wrap at page boundaries on increment.


4016 rwb++++ JOYSER0 - NES-style Joypad Access Port 1
Rd: ------ca
Wr: -------l
4017 r?b++++ JOYSER1 - NES-style Joypad Access Port 2
---111db

These registers basically have a direct connection to the controller
ports on the front of the SNES.

l = Writing this bit controls the Latch line of both controller
ports. When 1 is set, the Latch goes high (or is it low? At any
rate, whichever one makes the pads latch their state). When
cleared, the Latch goes the other way.

a/b = These bits return the state of the Data1 line.
c/d = These bits return the state of the Data2 line.
Reading $4016 drives the Clock line of Controller Port 1 low.
The SNES then reads the Data1 and Data2 lines, and Clock is set
back to high. $4017 does the same for Port 2.

Note the 1-bits in $4017: the CPU chip has pins for these bits, but
these pins are tied to Gnd and thus always 1.

Data for normal joypads is returned in the order: B, Y, Select,
Start, Up, Down, Left, Right, A, X, L, R, 0, 0, 0, 0, then ones
until latched again.

Note that Auto-Joypad Read (see register $4200) will effectively write
1 then 0 to bit 'l', then read 16 times from both $4016 and $4017. The
'a' bits will end up in $4218/9, with the first bit read (e.g. the B
button) in bit 15 of the word. Similarly, the 'b' bits end up in
$421a/b, the 'c' bits in $421c/d, and the 'd' bits in $421e/f. Any
further bits the device may return may be read from $4016/$4017 as
normal.

The effect of reading these during auto-joypad read is unknown.

See the section "CONTROLLERS" below for details.
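The latch-then-clock sequence can be sketched as follows. `SerialPad` and `auto_read` are hypothetical names, only one data line is modelled, and the bits are the logical values as they end up in the registers (the electrical levels on the wire are not modelled).

```python
BUTTON_ORDER = ["B", "Y", "Select", "Start", "Up", "Down", "Left", "Right",
                "A", "X", "L", "R"]

class SerialPad:
    """Toy joypad: 12 button bits, four 0 bits, then ones until latched again."""

    def __init__(self):
        self.pressed = set()
        self.shift = []

    def latch(self):
        # Models writing 1 then 0 to bit 'l' of $4016.
        self.shift = [1 if name in self.pressed else 0 for name in BUTTON_ORDER]
        self.shift += [0, 0, 0, 0]

    def clock(self):
        # Models one read of $4016/$4017: returns the next bit on the data line.
        return self.shift.pop(0) if self.shift else 1

def auto_read(pad):
    """What Auto-Joypad Read effectively does for one data line."""
    pad.latch()
    word = 0
    for _ in range(16):
        word = (word << 1) | pad.clock()   # first bit read ends up in bit 15
    return word
```

With B and A held, `auto_read` gives $8080, and any further `clock()` calls return 1.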


4200 wb+++? NMITIMEN - Interrupt Enable Flags
n-yx---a

n = Enable NMI. If clear, NMI will not occur. If set, NMI will fire
just after the start of V-Blank.

NMI fires shortly after the V Counter reaches $E1 (or presumably
$F0 if overscan is enabled, see register $2133).

x/y = IRQ enable.
0/0 => No IRQ will occur
0/1 => An IRQ will occur sometime just after the V Counter reaches
the value set in $4209/a.
1/0 => An IRQ will occur sometime just after the H Counter reaches
the value set in $4207/8.
1/1 => An IRQ will occur sometime just after the H Counter reaches
the value set in $4207/8 when V Counter equals the value set
in $4209/a.

a = Auto-Joypad Read Enable. When set, the registers $4218-$421f
will be updated at about V Counter = $E3 (or presumably $F2).

Some games try to read this register. However, they work only because
open bus behavior gives them values they expect.

This register is initialized to $00 on power on or reset.


4201 wb++++ WRIO - Programmable I/O port (out-port)
abxxxxxx

This is basically just an 8-bit I/O Port. 'b' is connected to pin 6 of
Controller Port 1. 'a' is connected to pin 6 of Controller Port 2, and
to the PPU Latch line. Thus, writing a 0 then a 1 to bit 'a' will latch
the H and V Counters much like reading $2137 (the latch happens on the
transition to 0). When bit 'a' is 0, no latching can occur.

Any other effects of this register are unknown. See $4213 for the
I half of the I/O Port.

Note that the IO Port is initialized as if this register were written
with all 1-bits at power up, unchanged on reset(?).


4202 wb++++ WRMPYA - Multiplicand A
4203 wb++++ WRMPYB - Multiplicand B
mmmmmmmm

Write $4202, then $4203. 8 "machine cycles" (probably 48 master cycles)
after $4203 is set, the product may be read from $4216/7. $4202 will
not be altered by this process, thus a new value may be written to
$4203 to perform another multiplication without resetting $4202.

The multiplication is unsigned.

$4202 holds the value $ff on power on and is unchanged on reset.
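A minimal sketch of the write sequence, ignoring the 8-machine-cycle delay. The class name `Multiplier` is invented for the example, and the initial content of the result is an arbitrary choice.

```python
class Multiplier:
    """Toy model of $4202/$4203 -> $4216/7 (timing not modelled)."""

    def __init__(self):
        self.wrmpya = 0xFF   # power-on value given in the text
        self.rdmpy = 0       # result register; initial value is an assumption

    def write(self, reg, value):
        value &= 0xFF
        if reg == 0x4202:
            self.wrmpya = value
        elif reg == 0x4203:
            # Unsigned 8x8 multiply; $4202 is left untouched.
            self.rdmpy = self.wrmpya * value
```

Because $4202 survives the operation, writing only $4203 again reuses the same multiplicand.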


4204 wl++++ WRDIVL - Dividend C low byte
4205 wh++++ WRDIVH - Dividend C high byte
dddddddd dddddddd
4206 wb++++ WRDIVB - Divisor B
bbbbbbbb

Write $4204/5, then $4206. 16 "machine cycles" (probably 96 master
cycles) after $4206 is set, the quotient may be read from $4214/5, and
the remainder from $4216/7. Presumably, $4204/5 are not altered by this
process, much like $4202.

The division is unsigned. Division by 0 gives a quotient of $FFFF and a
remainder of C.

WRDIV holds the value $ffff on power on and is unchanged on reset.
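The result of the division, including the divide-by-zero case, can be sketched like this. `snes_divide` is a hypothetical helper and the 16-machine-cycle delay is not modelled.

```python
def snes_divide(dividend, divisor):
    """Return (quotient, remainder) as they would appear in $4214/5 and $4216/7."""
    dividend &= 0xFFFF          # C, from $4204/5
    divisor &= 0xFF             # B, from $4206
    if divisor == 0:
        return 0xFFFF, dividend # quotient $FFFF, remainder C
    return dividend // divisor, dividend % divisor
```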


4207 wl++++ HTIMEL - H Timer low byte
4208 wh++++ HTIMEH - H Timer high byte
-------h hhhhhhhh

If bit 4 of $4200 is set and bit 5 is clear, an IRQ will fire every
scanline when the H Counter reaches the value set here. If bits 4 and 5
are both set, the IRQ will fire only when the V Counter equals the
value set in $4209/a.

Note that the H Counter ranges from 0 to 339, thus greater values will
result in no IRQ firing.

HTIME is initialized to $1ff on power on, unchanged on reset.


4209 wl++++ VTIMEL - V Timer low byte
420a wh++++ VTIMEH - V Timer high byte
-------v vvvvvvvv

If bit 5 of $4200 is set and bit 4 is clear, an IRQ will fire just
after the V Counter reaches the value set here. If bits 4 and 5 are
both set, the IRQ will fire instead when the V Counter equals the value
set here and the H Counter reaches the value set in $4207/8.

Note that the V Counter ranges from 0 to 261 in NTSC mode (262 is
possible every other frame when interlace is active) and 0 to 311 in
PAL mode (312 in interlace?), thus greater values will result in no IRQ
firing.

VTIME is initialized to $1ff on power on, unchanged on reset.
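The way bits 4 and 5 of $4200 combine with HTIME and VTIME can be sketched as a simple match test. The function names are made up, the "sometime just after" delay is ignored, and treating a V-only IRQ as matching at H Counter 0 is an assumption.

```python
def decode_nmitimen(value):
    """Split $4200 (n-yx---a) into its flags."""
    return {
        "nmi": bool(value & 0x80),
        "v_irq": bool(value & 0x20),        # 'y'
        "h_irq": bool(value & 0x10),        # 'x'
        "auto_joypad": bool(value & 0x01),
    }

def irq_matches(nmitimen, htime, vtime, h, v):
    """True if the counters (h, v) hit the configured IRQ point."""
    flags = decode_nmitimen(nmitimen)
    if flags["h_irq"] and flags["v_irq"]:
        return h == htime and v == vtime    # once per frame
    if flags["h_irq"]:
        return h == htime                   # every scanline
    if flags["v_irq"]:
        return v == vtime and h == 0        # assumption: start of the line
    return False
```

Since the H Counter only runs from 0 to 339, an HTIME of 400 never matches on any dot.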


420b wb++++ MDMAEN - DMA Enable
76543210

7/6/5/4/3/2/1/0 = Enable the selected DMA channels. The CPU will be
paused until all DMAs complete. DMAs will be executed in order from
0 to 7 (?).

See registers $43x0-$43xA for more details.

If HDMA (init or transfer) occurs while a DMA is in progress, the DMA
will be paused for the duration. If the HDMA happens to involve the
current DMA channel, the DMA will be immediately terminated and the
HDMA will progress using the then-current values of the registers.
Other DMA channels will be unaffected.

This register is initialized to $00 on power on or reset.

See the section "DMA AND HDMA" below for more information.


420c wb++++ HDMAEN - HDMA Enable
76543210

7/6/5/4/3/2/1/0 = Enable the selected HDMA channels. HDMAs will be
executed in order from 0 to 7 (?).

See registers $43x0-$43xA for more details.

If HDMA (init or transfer) occurs while a DMA is in progress, the DMA
will be paused for the duration. If the HDMA happens to involve the
current DMA channel, the DMA will be immediately terminated and the
HDMA will progress using the then-current values of the registers.
Other DMA channels will be unaffected.

Note that enabling a channel mid-frame will begin HDMA at the next HDMA
point. However, the HDMA register initialization only occurs before the
HDMA point on scanline 0, so those registers will have to be
initialized by hand before enabling HDMA. A channel that has already
terminated for the frame cannot be restarted in this manner.

Writing 0 to a bit will pause an ongoing HDMA; the transfer may be
continued by writing 1 to the bit.

This register is initialized to $00 on power on or reset.

See the section "DMA AND HDMA" below for more information.


420d wb++++ MEMSEL - ROM Access Speed
-------f

f = FastROM select. The SNES uses a master clock running at
about 21.477 MHz (current theory is 1.89e9/88 Hz). By default, the
SNES takes 8 master cycles for each ROM access. If this bit is set
and ROM is accessed via banks $80-$FF, only 6 master cycles will be
used.

This register is initialized to $00 on power on (or reset?).

See my memory map and timing doc (memmap.txt) for more details.
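To put numbers on the 8-versus-6 cycle difference, here is a small calculation using the master clock theory quoted above. The function names are hypothetical, and only ROM accesses are covered.

```python
MASTER_HZ = 1.89e9 / 88          # about 21.477 MHz, per the current theory

def rom_access_cycles(bank, fastrom):
    """Master cycles for one ROM access, given the bank and bit 0 of $420d."""
    return 6 if (fastrom and bank >= 0x80) else 8

def access_ns(cycles):
    """Duration of that many master cycles, in nanoseconds."""
    return cycles / MASTER_HZ * 1e9
```

This works out to roughly 372 ns for a slow access and 279 ns for a fast one.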


4210 r b++++ RDNMI - NMI Flag and 5A22 Version
n---vvvv

n = NMI Flag. This bit is set at the start of V-Blank (at the
moment, we suspect when H-Counter is somewhere between $28 and
$4E), and cleared on read or at the end of V-Blank. Supposedly, it
is required that this register be read during NMI.

Note that this bit is not affected by bit 7 of $4200.

vvvv = 5A22 chip version number. So far, we've encountered at least 2,
maybe 1 as well.

NMI is cleared on power on or reset.

The '-' bits are open bus.


4211 r b++++ TIMEUP - IRQ Flag
i-------

i = IRQ Flag. This bit is set just after an IRQ fires (at the
moment, it seems to have the same delay as the NMI Flag of $4210
has following NMI), and is cleared on read or write. Supposedly, it
is required that this register be read during the IRQ handler. If
this really is the case, then I suspect that that read is what
actually clears the CPU's IRQ line.

This register is marked read/write in another doc, with no explanation.

IRQ is cleared on power on or reset.

The '-' bits are open bus.


4212 r b++++ HVBJOY - PPU Status
vh-----a

v = V-Blank Flag. If we're currently in V-Blank, this flag is set,
otherwise it is clear. The setting seems to occur at H Counter
about $16-$17 when V Counter is $E1, and the clearing at about $1E
with V Counter 0.

h = H-Blank Flag. If we're currently in H-Blank, this flag is set,
otherwise it is clear. The setting seems to occur at H Counter
about $121-$122, and the clearing at about $12-$18.

a = Auto-Joypad Status. This is set while Auto-Joypad Read is in
progress, and cleared when complete. It typically turns on at
the start of V-Blank, and completes 3 scanlines later.

This register is marked read/write in another doc, with no explanation.


4213 r b++++ RDIO - Programmable I/O port (in-port)
abxxxxxx

Reading this register reads data from the I/O Port. The way the
I/O Port works, any bit set to 0 in $4201 will be 0 here. Any bit
set to 1 in $4201 may be 1 or 0 here, depending on whether any other
device connected to the I/O Port has set a 0 to that bit.

Bit 'b' is connected to pin 6 of Controller Port 1. Bit 'a' is
connected to pin 6 of Controller Port 2, and to the PPU Latch line.

See register $4201 for the O side of the I/O Port.


4214 r l++++ RDDIVL - Quotient of Divide Result low byte
4215 r h++++ RDDIVH - Quotient of Divide Result high byte
qqqqqqqq qqqqqqqq

Write $4204/5, then $4206. 16 "machine cycles" (probably 96 master
cycles) after $4206 is set, the quotient may be read from these
registers, and the remainder from $4216/7.

The division is unsigned.


4216 r l++++ RDMPYL - Multiplication Product or Divide Remainder low byte
4217 r h++++ RDMPYH - Multiplication Product or Divide Remainder high byte
xxxxxxxx xxxxxxxx

Write $4202, then $4203. 8 "machine cycles" (probably 48 master cycles)
after $4203 is set, the product may be read from these registers.

Write $4204/5, then $4206. 16 "machine cycles" (probably 96 master
cycles) after $4206 is set, the quotient may be read from $4214/5, and
the remainder from these registers.

The multiplication and division are both unsigned.


4218 r l++++ JOY1L - Controller Port 1 Data1 Register low byte
4219 r h++++ JOY1H - Controller Port 1 Data1 Register high byte
421a r l++++ JOY2L - Controller Port 2 Data1 Register low byte
421b r h++++ JOY2H - Controller Port 2 Data1 Register high byte
421c r l++++ JOY3L - Controller Port 1 Data2 Register low byte
421d r h++++ JOY3H - Controller Port 1 Data2 Register high byte
421e r l++++ JOY4L - Controller Port 2 Data2 Register low byte
421f r h++++ JOY4H - Controller Port 2 Data2 Register high byte
byetUDLR axlr0000

The bitmap above only applies for joypads, obviously. More
generically, Auto Joypad Read effectively sets 1 then 0 to $4016,
then reads $4016/7 16 times to get the bits for these registers.

a/b/x/y/l/r/e/t = A/B/X/Y/L/R/Select/Start button status.

U/D/L/R = Up/Down/Left/Right control pad status. Note that only one of
L/R and only one of U/D may be set, due to the pad hardware.

These registers are only updated when the Auto-Joypad Read bit (bit 0)
of $4200 is set. They are being updated while the Auto-Joypad Status
bit (bit 0) of $4212 is set. Reading during this time will return
incorrect values.

See the section "CONTROLLERS" below for details.
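Decoding the 16-bit joypad word against the bitmap above might look like this. `decode_joypad` and `looks_like_joypad` are invented names, and the second helper only checks the four trailing 0 bits that a normal joypad returns.

```python
JOY_BITS = ["B", "Y", "Select", "Start", "Up", "Down", "Left", "Right",
            "A", "X", "L", "R"]   # bit 15 down to bit 4

def decode_joypad(word):
    """List the buttons set in a JOYnH:JOYnL word (byetUDLR axlr0000)."""
    return [name for i, name in enumerate(JOY_BITS) if word & (0x8000 >> i)]

def looks_like_joypad(word):
    """A normal joypad returns 0 in the low four bits."""
    return (word & 0x000F) == 0
```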


43x0 rwb++++ DMAPx - DMA Control for Channel x (x=0-7)
da-ifttt

d = Transfer Direction. When clear, data will be read from the CPU
memory and written to the PPU register. When set, vice versa.

Contrary to previous belief, this bit DOES affect HDMA! Indirect
mode is more useful, it will read the table as normal and write
from Bus B to the Bus A address specified. Direct mode will work as
expected though, it will read counts from the table and try to
write the data values into the table.

a = HDMA Addressing Mode. When clear, the HDMA table contains the
data to transfer. When set, the HDMA table contains pointers to the
data. This bit does not affect DMA.

i = DMA Address Increment. When clear, the DMA address will be
incremented for each byte. When set, the DMA address will be
decremented. This bit does not affect HDMA.

f = DMA Fixed Transfer. When set, the DMA address will not be
adjusted. When clear, the address will be adjusted as specified by
bit 4. This bit does not affect HDMA.

ttt = Transfer Mode.
000 => 1 register write once (1 byte: p )
001 => 2 registers write once (2 bytes: p, p+1 )
010 => 1 register write twice (2 bytes: p, p )
011 => 2 registers write twice each (4 bytes: p, p, p+1, p+1)
100 => 4 registers write once (4 bytes: p, p+1, p+2, p+3)
101 => 2 registers write twice alternate (4 bytes: p, p+1, p, p+1)
110 => 1 register write twice (2 bytes: p, p )
111 => 2 registers write twice each (4 bytes: p, p, p+1, p+1)

The effect of writing this register during HDMA to the associated
channel is unknown. Most likely, the change takes effect for the
next HDMA transfer.

This register is set to $ff on power on, and is unchanged on reset.

See the section "DMA AND HDMA" below for more information.
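The bit layout and the transfer mode table can be captured in a few lines. The names `decode_dmap` and `TRANSFER_PATTERNS` are made up for this sketch; the patterns are the Bus B offsets from p listed above.

```python
# Bus B register offsets (relative to p) for each transfer mode ttt.
TRANSFER_PATTERNS = {
    0: (0,),
    1: (0, 1),
    2: (0, 0),
    3: (0, 0, 1, 1),
    4: (0, 1, 2, 3),
    5: (0, 1, 0, 1),
    6: (0, 0),
    7: (0, 0, 1, 1),
}

def decode_dmap(value):
    """Split $43x0 (da-ifttt) into its fields."""
    return {
        "b_to_a": bool(value & 0x80),         # d: set => PPU register to CPU memory
        "hdma_indirect": bool(value & 0x40),  # a
        "decrement": bool(value & 0x10),      # i
        "fixed": bool(value & 0x08),          # f
        "mode": value & 0x07,                 # ttt
    }
```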


43x1 rwb++++ BBADx - DMA Destination Register for Channel x (x=0-7)
pppppppp

This specifies the Bus B address to access. Considering the standard
CPU memory space, this specifies which address $00:2100-$00:21ff to
access, with two- and four-register modes wrapping $21ff->$2100, not
$2200.

The effect of writing this register during HDMA to the associated
channel is unknown. Most likely, the change takes effect for the
next transfer.

This register is set to $ff on power on, and is unchanged on reset.

See the section "DMA AND HDMA" below for more information.


43x2 rwl++++ A1TxL - DMA Source Address for Channel x low byte (x=0-7)
43x3 rwh++++ A1TxH - DMA Source Address for Channel x high byte (x=0-7)
43x4 rwb++++ A1Bx - DMA Source Address for Channel x bank byte (x=0-7)
bbbbbbbb hhhhhhhh llllllll

This specifies the starting Address Bus A address for the DMA transfer,
or the beginning of the HDMA table for HDMA transfers. Note that Bus A
does not access the Bus B registers, so pointing this address at say
$00:2100 results in open bus.

The effect of writing this register during HDMA to the associated
channel is unknown. However, current theory is that only $43x4 will
affect the transfer. The changes will take effect at the next HDMA
init.

During DMA, $43x2/3 will be incremented or decremented as specified by
$43x0. However $43x4 will NOT be adjusted. These registers will not be
affected by HDMA.

This register is set to $ff on power on, and is unchanged on reset.

See the section "DMA AND HDMA" below for more information.


43x5 rwl++++ DASxL - DMA Size/HDMA Indirect Address low byte (x=0-7)
43x6 rwh++++ DASxH - DMA Size/HDMA Indirect Address high byte (x=0-7)
43x7 rwb++++ DASBx - HDMA Indirect Address bank byte (x=0-7)
bbbbbbbb hhhhhhhh llllllll

For DMA, $43x5/6 indicate the number of bytes to transfer. Note that
this is a strict limit: if this is set to 1 then only 1 byte will be
written, even if the transfer mode specifies 2 or 4 registers (and if
this is 5, all 4 registers would be written once, then the first only
would be written a second time). Note, however, that writing $0000 to
this register actually results in a transfer of $10000 bytes, not 0.

$43x5/6 are decremented during DMA, and thus typically end up set to 0
when DMA is complete.

For HDMA, $43x7 specifies the bank for indirect addressing mode. The
indirect address is copied into $43x5/6 and incremented appropriately.
For direct HDMA, these registers are not used or altered.

Writes to $43x7 during indirect HDMA will take effect for the next
transfer. Writes to $43x5/6 during indirect HDMA will also take effect
for the next HDMA transfer, however this is only noticeable during
repeat mode (for normal mode, a new indirect address will be read from
the table before the transfer). For a direct transfer, presumably
nothing will happen.

This register is set to $ff on power on, and is unchanged on reset.

See the section "DMA AND HDMA" below for more information.
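Putting the byte count together with the transfer mode and the $21ff->$2100 wrap, the sequence of Bus B addresses touched by a DMA can be sketched as below. `dma_b_addresses` is a hypothetical helper, and it only lists addresses (it moves no data).

```python
PATTERNS = {0: (0,), 1: (0, 1), 2: (0, 0), 3: (0, 0, 1, 1),
            4: (0, 1, 2, 3), 5: (0, 1, 0, 1), 6: (0, 0), 7: (0, 0, 1, 1)}

def dma_b_addresses(bbad, mode, size):
    """Bus B addresses written (or read) by a DMA, in order."""
    pattern = PATTERNS[mode & 7]
    count = (size & 0xFFFF) or 0x10000      # $0000 means $10000 bytes
    return [0x2100 | ((bbad + pattern[i % len(pattern)]) & 0xFF)
            for i in range(count)]
```

For example, mode 4 with a size of 5 touches p, p+1, p+2, p+3 and then p once more.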


43x8 rwl++++ A2AxL - HDMA Table Address low byte (x=0-7)
43x9 rwh++++ A2AxH - HDMA Table Address high byte (x=0-7)
aaaaaaaa aaaaaaaa

At the beginning of the frame $43x2/3 are copied into this register for
all active HDMA channels, and then this register is updated as the
table is read. Thus, if a game wishes to start HDMA mid-frame (or
change tables mid-frame), this register must be written. Writing this
register mid-frame changes the table address for the next scanline.

This register is not used for DMA.

This register is set to $ff on power on, and is unchanged on reset.

See the section "DMA AND HDMA" below for more information.


43xa rwb++++ NLTRx - HDMA Line Counter (x=0-7)
rccccccc

r = Repeat Select. When set, the HDMA transfer will be performed
every line, rather than only when this register is loaded from the
table. However, this byte (and the indirect HDMA address) will only
be reloaded from the table when the counter reaches 0.

ccccccc = Line count. This is decremented every scanline. When it
reaches 0, a byte is read from the HDMA table into this register
(and the indirect HDMA address is read into $43x5/6 if applicable).

One oddity: the register is decremented before being checked for r
status or c==0. Thus, setting a value of $80 is really "128 lines with
no repeat" rather than "0 lines with repeat". Similarly, a value of $00
will be "128 lines with repeat" when it doesn't mean "terminate the
channel".

This register is initialized at the end of V-Blank for every active
HDMA channel. Note that if a game wishes to begin HDMA during the
frame, it will most likely have to initialize this register. Writing
this mid-transfer will similarly change the count and repeat to take
effect next scanline. Remember though that 'repeat' won't take effect
until after the next transfer period.

This register is set to $ff on power on, and is unchanged on reset.

See the section "DMA AND HDMA" below for more information.
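The decrement-before-check oddity is easier to see in a small simulation. `hdma_line_counter` is an invented helper that covers only the counter itself: it returns how many scanlines pass before the next reload, and the repeat status seen after the first decrement.

```python
def hdma_line_counter(value):
    """Return (lines until reload, repeat flag as seen after the first decrement)."""
    reg = value & 0xFF
    lines = 0
    repeat = None
    while True:
        reg = (reg - 1) & 0xFF          # decrement first...
        lines += 1
        if repeat is None:
            repeat = bool(reg & 0x80)   # ...then look at r
        if (reg & 0x7F) == 0:           # ...and at c == 0
            break
    return lines, repeat
```

Under this model $80 gives 128 lines with no repeat and $00 gives 128 lines with repeat, matching the description above. A $00 read from the table as "terminate the channel" is a separate case that this sketch does not cover.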

43xb rwb++++ ????x - Unknown (x=0-7)
43xf rwb++++ ????x - Unknown (x=0-7)
????????

The effects of these registers (if any) are unknown. $43xf and $43xb
are really aliases for the same register.

This register is set to $ff on power on, and is unchanged on reset.

Sprites

The SNES has 128 independent sprites. The sprite definitions are stored in Object Attribute Memory, or OAM.

OAM

OAM consists of 544 bytes, organized into a low table of 512 bytes and a high table of 32 bytes. Both tables are made up of 128 records. OAM is accessed by setting the word address in register $2102, the "table select" in bit 0 of $2103, then writing to $2104 or reading from $2138. Since the high table is only 32 bytes long, only the low 4 bits of $2102 are significant for indexing this table.

The internal OAM address is invalidated during the rendering of a scanline; this invalidation is deterministic, but we do not know how or when the value is determined. Current theory is that it is invalidated more-or-less continuously and has something to do with the current OAM address and possibly which sprites are on the current scanline. The internal OAM address is reloaded from $2102/3 at the beginning of V-Blank, if this occurs outside of a force-blank period. The reload also occurs on a 1->0 transition of $2100.7.

Each read/write increments the address by one byte (the internal address has 10 bits, with bit 9 selecting the table and bits 0-8 indexing). Reads simply read the current byte. Writes to the low table go into a word-sized buffer, which is written to the appropriate word of OAM when the high byte of the word is written. Thus, if alternating reads and writes occur such that the high byte of the word is always read instead of written, none of the writes will actually affect OAM. If the alternation happens such that the writes always occur to the high byte, not only the high bytes but whatever garbage is left in the low byte will be written as well!

Pictorially: Start OAM filled with all zeros. Write 1, read, read, write 2, read, write 3 => OAM is 00 00 01 02 01 03, rather than 01 00 00 02 00 03 as you might expect.

Writes to the high table, on the other hand, work exactly as expected.
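The word buffer behaviour can be reproduced with a small model. `OamPort` and its method names are hypothetical, and address invalidation during rendering is not modelled.

```python
class OamPort:
    """Toy model of OAM access through $2102/3, $2104 and $2138."""

    def __init__(self):
        self.oam = bytearray(544)   # 512-byte low table + 32-byte high table
        self.addr = 0               # 10-bit internal byte address
        self.low_latch = 0          # buffered low byte of the current word

    def set_address(self, word_addr, high_table=False):
        self.addr = ((word_addr & 0xFF) << 1) | (0x200 if high_table else 0)

    def _index(self):
        if self.addr & 0x200:       # high table: only 5 address bits matter
            return 0x200 + (self.addr & 0x1F)
        return self.addr

    def read(self):
        value = self.oam[self._index()]
        self.addr = (self.addr + 1) & 0x3FF
        return value

    def write(self, value):
        value &= 0xFF
        if self.addr & 0x200:       # high table writes go straight through
            self.oam[self._index()] = value
        elif self.addr & 1:         # high byte: commit the whole word
            self.oam[self.addr - 1] = self.low_latch
            self.oam[self.addr] = value
        else:                       # low byte: just buffer it
            self.low_latch = value
        self.addr = (self.addr + 1) & 0x3FF
```

Running the sequence from the example above (write 1, read, read, write 2, read, write 3) through this model leaves the first six bytes as 00 00 01 02 01 03.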

The record format for the low table is 4 bytes: 
byte OBJ*4+0: xxxxxxxx
byte OBJ*4+1: yyyyyyyy
byte OBJ*4+2: cccccccc
byte OBJ*4+3: vhoopppN

The record format for the high table is 2 bits:
bit 0/2/4/6 of byte OBJ/4: X
bit 1/3/5/7 of byte OBJ/4: s

The values are:
Xxxxxxxxx = X position of the sprite. Basically, consider this signed but see
below.
yyyyyyyy = Y position of the sprite. Values 0-239 are on-screen. -63 through
-1 are "off the top", so the bottom part of the sprite comes in at the
top of the screen. Note that this implies a really big sprite can go off
the bottom and come back in the top.
cccccccc = First tile of the sprite. See below for the calculation of the
VRAM address. Note that this could also be considered as 'rrrrcccc'
specifying the row and column of the tile in the 16x16 character table.
N = Name table of the sprite. See below for the calculation of the
VRAM address.
ppp = Palette of the sprite. The first palette index is 128+ppp*16.
oo = Sprite priority. See below for details.
h/v = Horizontal/Vertical flip flags. Note this flips the whole sprite,
not just the individual tiles. However, the rectangular sprites are
flipped vertically as if they were two square sprites (i.e. rows
"01234567" flip to "32107654", not "76543210").
s = Sprite size flag. See below for details.

The sprite size is controlled by bits 5-7 of $2101, and the Size bit of OAM. $2101 determines the two possible sizes for all sprites. If the OAM Size flag is 0, the sprite uses the smaller size, otherwise it uses the larger size.
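Pulling one sprite's fields out of the two tables might look like the following. `decode_sprite` is a made-up helper that takes the 544 OAM bytes (low table first, then high table) and returns a plain dict.

```python
def decode_sprite(oam, n):
    """Decode sprite n (0-127) from a 544-byte OAM image."""
    x_low, y, tile, attr = oam[n * 4:n * 4 + 4]
    high = (oam[0x200 + (n >> 2)] >> ((n & 3) * 2)) & 3   # 2 bits per sprite
    x = x_low | ((high & 1) << 8)                         # Xxxxxxxxx
    if x >= 256:
        x -= 512                                          # treat as signed
    palette = (attr >> 1) & 7
    return {
        "x": x,
        "y": y,
        "tile": tile,
        "name": attr & 1,
        "palette": palette,
        "first_color": 128 + palette * 16,
        "priority": (attr >> 4) & 3,
        "hflip": bool(attr & 0x40),
        "vflip": bool(attr & 0x80),
        "large": bool(high & 2),                          # s
    }
```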

Palettes

There are 8 16-color palettes available to sprites, starting at CGRAM index 128. Thus, the palette number 'ppp' in OAM indicates that colors 128+ppp*16 through 128+ppp*16+15 are available to this sprite. However, the first of these is always considered transparent, to allow for non-rectangular shaped sprites.

Only sprites with palettes 4-7 participate in color math.

Character table in VRAM

Sprites have two 16x16 tile character tables in VRAM. Wrapping on these works much like for BG tilemaps: tile 0 is to the right of tile $0F and below tile $F0, tile $10 is below tile 0 and to the right of tile $1F, tile $FF is to the left of tile $F0 and above tile $0F, and so on. Which character table a sprite uses is determined by the N bit in OAM. So if you specify Tile=$ff, your 16x16 sprite is made of tiles $ff, $f0, $0f, and $00.

The first table is at the address specified by the Name Base bits of $2101, and the offset of the second is determined by the Name bits of $2101. The word address in VRAM of a sprite's first tile may be calculated as:

((Base<<13) + (cccccccc<<4) + (N ? ((Name+1)<<12) : 0)) & 0x7fff

See the section "BACKGROUNDS" below for details on the format of the character data.
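The address formula and the wrapping within the 16x16 character table can be sketched as follows. Both function names are invented, and `sprite_tiles` takes the sprite size in tiles rather than pixels.

```python
def sprite_tile_word_address(base, name, tile, n):
    """VRAM word address of a sprite's first tile."""
    addr = (base << 13) + ((tile & 0xFF) << 4)
    if n:
        addr += (name + 1) << 12
    return addr & 0x7FFF

def sprite_tiles(tile, w_tiles, h_tiles):
    """Tile numbers making up a sprite, wrapping within the 16x16 table."""
    row, col = (tile >> 4) & 0xF, tile & 0xF
    return [[(((row + dy) & 0xF) << 4) | ((col + dx) & 0xF)
             for dx in range(w_tiles)]
            for dy in range(h_tiles)]
```

`sprite_tiles(0xFF, 2, 2)` gives $ff, $f0 on the top row and $0f, $00 on the bottom row, as in the example above.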

Sprite Priority

There are two 'priority' concepts applicable to sprites. First, there are the priority bits in OAM, which control the priority of the sprites relative to the BGs. See the section "BACKGROUNDS" for more details on this.

The second is the priority with relation to the other sprites. This is completely controlled by the sprite's index and the priority rotation setting.

Priority rotation is set by bit 7 of $2103. If the bit is unset, Sprite 0 is always the first sprite. Otherwise, take the current internal OAM word address (not affected by OAM Address Invalidation) and give priority to the sprite number (OAMAddr&0xFE)>>1. So if you set $2102/3 to $104, then write 4 bytes, sprite 3 will have priority for the next frame. However, OAM Address Reset will reset the internal OAM address to word $104, so sprite 2 will have priority for subsequent frames.

There is one major oddity: if you set $2102/3=A, then write 4n+2*(A&1)+1 bytes (e.g. so the next byte written would go to the last byte in the 4-byte sprite record), sprite ((OAMAddr>>1)+Y)&0x7F has priority (where Y is the current line as addressed by sprites). Thus, if you put all 128 8x8 sprites at Y=63, write $8000 to $2102/3, then read 3 bytes from $2138, you will see sprites 63-70 having priority on successive scanlines.

FirstSprite ends up on top of all other sprites, regardless of the priority bits in OAM. FirstSprite+1 is on top of FirstSprite+2 is on top of FirstSprite+3 and so on until FirstSprite+127 (wrapping of course from sprite 127 to sprite 0). Note that only the priority of the topmost sprite is considered relative to the backgrounds. Thus, if FirstSprite+3 and FirstSprite+4 are identical except FirstSprite+3 has priority 0 and FirstSprite+4 has priority 3, they will both be hidden by any backgrounds that hide priority 0 sprites. This may seem counterintuitive, since FirstSprite+4 would normally go in front of these BGs, but many games depend on this behavior.
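Ignoring the "major oddity" case, the choice of FirstSprite and the resulting sprite-to-sprite ordering can be sketched like this. The function names are hypothetical.

```python
def first_sprite(oam_word_addr, rotation_enabled):
    """Sprite that gets top priority, given the internal OAM word address."""
    if not rotation_enabled:        # bit 7 of $2103 clear
        return 0
    return (oam_word_addr & 0xFE) >> 1

def draw_order(first):
    """Sprite numbers from topmost to bottommost."""
    return [(first + i) & 0x7F for i in range(128)]
```

Setting $2102/3 to word $104 and then writing 4 bytes (2 words) gives `first_sprite(0x106, True)`, which is 3. After the address reset it is back to `first_sprite(0x104, True)`, which is 2.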

Drawing the Sprites

As with everything else on the SNES, sprites are drawn per-scanline. The process is basically as follows:
0) If any OBJ is at X=256 (or X=-256, same difference), consider it as being at X=0 when considering Range and Time. Note that this doesn't mean you actually draw it at X=0.
1) Range: Starting with the FirstSprite, determine the first 32 sprites on this scanline. Only those sprites with -size < X < 256 are considered in Range. If there are more than 32 sprites on the scanline, set bit 6 of register $213e.
2) Time: Starting with the last sprite in Range, load up to 34 8x8 tiles (from left-to-right, after flipping). If there are more than 34 tiles in Range, set bit 7 of $213e. Only those tiles with -8 < X < 256 are counted.
3) Associate with each tile in Range and Time its true X position (256/-256 should not be set to 0), palette, and priority for drawing.

See the section "RENDERING THE SCREEN" below for details.
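A rough sketch of the Range step only (the Time step is not covered). `sprites_in_range` is a made-up helper taking a list of 128 dicts with signed `x`, 8-bit `y`, and `width`/`height` in pixels. Testing the scanline with an 8-bit wrap on Y is an assumption of this sketch.

```python
def sprites_in_range(sprites, first_sprite, line):
    """Return (sprite numbers in Range, range-over flag) for one scanline."""
    in_range = []
    range_over = False
    for i in range(128):
        n = (first_sprite + i) & 0x7F
        s = sprites[n]
        x = 0 if s["x"] == -256 else s["x"]       # step 0: X=-256 counts as X=0
        if not (-s["width"] < x < 256):
            continue                              # off-screen horizontally
        if ((line - s["y"]) & 0xFF) >= s["height"]:
            continue                              # not on this line (assumed wrap)
        if len(in_range) == 32:
            range_over = True                     # would set bit 6 of $213e
            break
        in_range.append(n)
    return in_range, range_over
```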

Backgrounds

BG Modes

The SNES has 7 background modes, two of which have major variations. The modes are selected by bits 0-2 of register $2105. The variation of Mode 1 is selected by bit 3 of $2105, and the variation of Mode 7 is selected by bit 6 of $2133.

Mode    # Colors for BG
         1   2   3   4
======---=---=---=---=
0        4   4   4   4
1       16  16   4   -
2       16  16   -   -
3      256  16   -   -
4      256   4   -   -
5       16   4   -   -
6       16   -   -   -
7      256   -   -   -
7EXTBG 256 128   -   -

In all modes and for all BGs, color 0 in any palette is considered transparent.

Tile Maps and Character Maps

Each BG has two regions of VRAM associated with it: one for the tilemap, and one for the character data.

The tilemap address is selected by bits 2-7 of registers $2107-a, and the tilemap size is selected by bits 0-1 of that same register. All tilemaps are 32x32, bits 0-1 simply select the number of 32x32 tilemaps and how they're laid out in memory:

  00  32x32   AA
              AA
  01  64x32   AB
              AB
  10  32x64   AA
              BB
  11  64x64   AB
              CD

Starting at the tilemap address, the first $800 bytes are for tilemap A. Then come the $800 bytes for B, then C then D. Of course, if only A is required something else could be stuck in the empty space.

Each entry in the tilemap is 2 bytes, formatted as (high low): 
vhopppcc cccccccc

v/h = Vertical/Horizontal flip this tile.
o = Tile priority.
ppp = Tile palette. The number of entries in the palette depends on the Mode
and the BG.
cccccccccc = Tile number.
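Splitting a 16-bit tilemap entry into these fields is a matter of shifts and masks. `decode_tilemap_entry` is a hypothetical helper name.

```python
def decode_tilemap_entry(word):
    """Split a tilemap entry (vhopppcc cccccccc) into its fields."""
    return {
        "vflip": bool(word & 0x8000),
        "hflip": bool(word & 0x4000),
        "priority": (word >> 13) & 1,
        "palette": (word >> 10) & 7,
        "tile": word & 0x3FF,
    }
```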

To find the tilemap word address for a particular tile (X and Y), you'd use a formula something like this:
(Addr<<9) + ((Y&0x1f)<<5) + (X&0x1f) + (SY ? ((Y&0x20)<<(SX ? 6 : 5)) : 0) + (SX ? ((X&0x20)<<5) : 0)

The tile character data is stored at the address pointed to by registers $210b-c, starting at byte address:
(Base<<13) + (TileNumber * 8*NumBitplanes)

Each tile is (normally) 8x8 pixels. The data is stored in bitplanes. Each row of the tile fills 1 byte, with the leftmost pixel being in bit 7. For 4-color tiles, bitplanes 0 and 1 are stored in the low and high bytes of a word, with 8 words making up the tile. For a 16-color tile, bitplanes 0 and 1 are stored as for a 4-color tile, followed by bitplanes 2 and 3 in the same format. A 256-color tile is stored in the same way as 2 4-color tiles.
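The lookups above can be sketched in three small helpers. All names are invented. `tilemap_word_offset` returns only the offset from the start of the tilemap (the base address term is left to the caller), and `decode_row_4color` handles a single row of a 4-color tile.

```python
def tilemap_word_offset(x, y, sx, sy):
    """Word offset of tile (x, y) from the start of the BG's tilemap."""
    offset = ((y & 0x1F) << 5) + (x & 0x1F)
    if sy:
        offset += (y & 0x20) << (6 if sx else 5)
    if sx:
        offset += (x & 0x20) << 5
    return offset

def char_byte_address(base, tile, bitplanes):
    """Byte address of a tile's character data."""
    return (base << 13) + tile * 8 * bitplanes

def decode_row_4color(word):
    """Eight 2-bit pixels from one row word; leftmost pixel is bit 7."""
    low, high = word & 0xFF, (word >> 8) & 0xFF
    return [((low >> (7 - i)) & 1) | (((high >> (7 - i)) & 1) << 1)
            for i in range(8)]
```

In a 64x64 layout, tile (32, 32) lands at offset $C00, which is the start of tilemap D.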

If the appropriate bit of $2105 is set, each "tile" of the tilemap actually corresponds to a 16x16 pixel block consisting of Tile, Tile+1, Tile+16, and Tile+17. In this case, the 32x32 tile tilemap codes for a 512x512 pixel screen rather than a 256x256 pixel screen as normal. Thus, using both 16x16 tiles and the 64x64 tilemap each BG can be up to 1024x1024 pixels. There is no wrapping like there is for 16x16 sprites: if you specify Tile=$2ff, you'll get $2ff, $300, $30f, and $310 (as opposed to $2ff, $2f0, $20f, and $200 you might otherwise expect). $3ff goes to $000, of course. Flipping in this mode flips the whole 16x16 tile, not just the individual 8x8 tiles.
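A small sketch of which 8x8 tiles make up a 16x16 BG tile, and where they end up after flipping. `bg_16x16_tiles` is a hypothetical helper; it returns the 2x2 arrangement as rows, and each 8x8 tile would additionally be flipped itself.

```python
def bg_16x16_tiles(tile, hflip=False, vflip=False):
    """2x2 grid of tile numbers for a 16x16 BG tile, top row first."""
    grid = [[(tile + dy * 16 + dx) & 0x3FF for dx in range(2)]
            for dy in range(2)]
    if hflip:
        grid = [row[::-1] for row in grid]   # swap left and right columns
    if vflip:
        grid = grid[::-1]                    # swap top and bottom rows
    return grid
```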

BG Scrolling

Of course, depending on the BG mode and the interlace setting, Modes 0-6 have an actual display of 256x224 or 256x239 pixels. The BG scroll registers $210d-$2114 control the offset of the displayed area within that possible 256x256 to 1024x1024 pixel BG.

The display can never fall outside the BG: if that would seem to be the case, simply wrap around back to 0 (or 'tile' the BG to fill the full 1024x1024, however you like to think of it).

The registers $210d-$2114 are all write-twice to set the 16-bit value. The way this works, the last write to any of these registers is stored in a buffer. When a new byte is written to any register, the current register value, the previous byte written to any of these registers, and the new byte written are combined as follows:
For BGnHOFS: (NewByte<<8) | (PrevByte&~7) | ((CurrentValue>>8)&7)
For BGnVOFS: (NewByte<<8) | PrevByte

For the most part, the details don't really matter as most games always write two bytes to one of these registers. However, some games write only one byte, or they do other odd things.
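The shared-buffer behaviour can be modelled directly from the two formulas. `BgScrollRegs` is an invented class; the results are kept as raw 16-bit values without masking to the bits the PPU actually uses.

```python
class BgScrollRegs:
    """Toy model of the write-twice BGnHOFS/BGnVOFS registers."""

    def __init__(self):
        self.prev = 0                           # last byte written to any of them
        self.hofs = {1: 0, 2: 0, 3: 0, 4: 0}
        self.vofs = {1: 0, 2: 0, 3: 0, 4: 0}

    def write_hofs(self, bg, new):
        new &= 0xFF
        cur = self.hofs[bg]
        self.hofs[bg] = (new << 8) | (self.prev & ~7) | ((cur >> 8) & 7)
        self.prev = new

    def write_vofs(self, bg, new):
        new &= 0xFF
        self.vofs[bg] = (new << 8) | self.prev
        self.prev = new
```

The usual two-write sequence ($34 then $12) ends with $1234 in either kind of register. A single write to a different register afterwards picks up the stale buffered byte.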

Thus, the tilemap entry for a particular X and Y position on the screen may be calculated as follows:
Size = 8 or 16 depending on the appropriate bit of $2105
TileX = (X + BGnHOFS)/Size
TileY = (Y + BGnVOFS)/Size
Look up the tile at TileX and TileY as described above.

Note that many games will set their vertical scroll values to -1 rather than 0. This is because the SNES loads OBJ data for each scanline during the previous scanline. The very first line, though, wouldn't have any OBJ data loaded! So the SNES doesn't actually output scanline 0, although it does everything to render it. These games want the first line of their tilemap to be the first line output, so they set their VOFS registers in this manner. Note that an interlace screen needs -2 rather than -1 to properly correct for the missing line 0 (and an emulator would need to add 2 instead of 1 to account for this).

Direct Color Mode

For the 256-color BGs of Modes 3, 4, and 7, $2130 bit 0 when set enables direct color mode. In this mode, instead of ignoring ppp and using the character data as the palette index, you treat the character data as expressing a color BBGGGRRR, and use the 3 bits of ppp as bgr to make the color Red=RRRr0, Green=GGGg0, Blue=BBb00.

In direct color mode you cannot have a black pixel, since any pixel with character data = 0 is still considered transparent. Use one of the almost-black colors instead (01, 08 or 09 are good choices).
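The bit shuffling for direct color can be written out as below. `direct_color` is a hypothetical helper returning 5-bit (red, green, blue) components; it does not handle the transparency of character data 0.

```python
def direct_color(char, ppp):
    """Build 5-bit R, G, B from BBGGGRRR character data and ppp (as bgr)."""
    r = ((char & 0x07) << 2) | ((ppp & 1) << 1)                  # RRRr0
    g = (((char >> 3) & 0x07) << 2) | (((ppp >> 1) & 1) << 1)    # GGGg0
    b = (((char >> 6) & 0x03) << 3) | (((ppp >> 2) & 1) << 2)    # BBb00
    return r, g, b
```

With ppp=0, the almost-black character values 01, 08 and 09 come out as (4, 0, 0), (0, 4, 0) and (4, 4, 0).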

Mode 0

In Mode 0, you have 4 BGs of 4 colors each. To calculate the starting palette entry for a particular tile, you calculate:

ppp*4 + (BG#-1)*32

The background priority is (from 'front' to 'back'):
Sprites with priority 3
BG1 tiles with priority 1
BG2 tiles with priority 1
Sprites with priority 2
BG1 tiles with priority 0
BG2 tiles with priority 0
Sprites with priority 1
BG3 tiles with priority 1
BG4 tiles with priority 1
Sprites with priority 0
BG3 tiles with priority 0
BG4 tiles with priority 0

Mode 1

In Mode 1, you have 2 BGs of 16 colors and 1 BG of 4 colors. To calculate the starting palette entry, calculate:

ppp*ncolors

The background priority varies depending on the setting of bit 3 of $2105. The priority is (from 'front' to 'back'):
BG3 tiles with priority 1 if bit 3 of $2105 is set
Sprites with priority 3
BG1 tiles with priority 1
BG2 tiles with priority 1
Sprites with priority 2
BG1 tiles with priority 0
BG2 tiles with priority 0
Sprites with priority 1
BG3 tiles with priority 1 if bit 3 of $2105 is clear
Sprites with priority 0
BG3 tiles with priority 0
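For a renderer, the Mode 1 ordering can be kept as a simple front-to-back list. The function name and the layer labels ("OBJ3" for sprites with priority 3, "BG1.1" for BG1 tiles with priority 1, and so on) are invented for this sketch.

```python
def mode1_layer_order(bg3_priority_bit):
    """Front-to-back layer order in Mode 1, given bit 3 of $2105."""
    order = ["OBJ3", "BG1.1", "BG2.1", "OBJ2", "BG1.0", "BG2.0",
             "OBJ1", "BG3.1", "OBJ0", "BG3.0"]
    if bg3_priority_bit:
        order.remove("BG3.1")
        order.insert(0, "BG3.1")    # BG3 priority-1 tiles jump to the front
    return order
```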

Mode 2

In Mode 2, you have 2 BGs of 16 colors each. To calculate the starting palette index, calculate:

ppp*16

The priority is (from 'front' to 'back'):
  Sprites with priority 3
  BG1 tiles with priority 1
  Sprites with priority 2
  BG2 tiles with priority 1
  Sprites with priority 1
  BG1 tiles with priority 0
  Sprites with priority 0
  BG2 tiles with priority 0

Note the change from Modes 0 and 1.

Mode 2 is the first of the Offset-Per-Tile Modes. In this mode, the 'tile data' for BG3 actually encodes a (possible) replacement HOffset and/or VOffset value for each tile of BG1 and/or BG2.

Consider a visible scanline. Normally, you'd get the pixels something like this:

  HOFS = X + BGnHOFS
  VOFS = Y + BGnVOFS
  Pixel[X,Y] = GetPixel(GetTile(BGn, HOFS, VOFS), HOFS, VOFS)

With offset-per-tile, the formula is a little more complicated:

  HOFS = X + BGnHOFS
  VOFS = Y + BGnVOFS
  ValidBit = 0x2000 for BG1, or 0x4000 for BG2

  if (!IsFirst8x8Tile(BGn, HOFS)) {
      /* Hopefully these calculations are right... */
      Hval = GetTile(BG3, (HOFS&7)|(((X-8)&~7)+(BG3HOFS&~7)), BG3VOFS)
      Vval = GetTile(BG3, (HOFS&7)|(((X-8)&~7)+(BG3HOFS&~7)), BG3VOFS + 8)
      if (Hval&ValidBit) HOFS = (HOFS&7) | ((X&~7) + (Hval&~7))
      if (Vval&ValidBit) VOFS = Y + Vval
  }
  Pixel[X,Y] = GetPixel(Get8x8Tile(BGn, HOFS, VOFS), HOFS, VOFS)

In other words, number the visible tiles in BGn from 0-32, and the 'visible' tiles in BG3 the same way. BGn tile 0 is offset as normal, then for 1<=T<33 BGn tile T gets the offset data from BG3 tile T-1. It doesn't matter whether or not the tiles actually align in any way.

Note that the leftmost visible tile is done as normal in all cases (although as little as 1 pixel may be visible, and if that still bothers you then use a clip window to hide it), and the next tile uses the tilemap entry for what would be BG3's leftmost tile. Note also that the 'new' offset completely overrides the BGnVOFS register, but the lower 3 bits of the BGnHOFS offset are still used. And note that the current Y position on the screen does not affect which row of the BG3 tilemap to reference, it's as if Y were always 0.

On the other hand, note that even if BGn is 16x16 tiles, BG3 can specify the offset for each 8x8 subtile. And if BG3 is 16x16, the offsets will apply to all the corresponding 8x8 subtiles on BGn. Also note that if BG3 is 16x16, we may end up using the same tile for Hval and Vval.

Mode 3

In Mode 3, you have one 256-color BG and one 16-color BG. To calculate the starting palette index, calculate:

  BG1: 0
  BG2: ppp*16

The priority is (from 'front' to 'back'):
  Sprites with priority 3
  BG1 tiles with priority 1
  Sprites with priority 2
  BG2 tiles with priority 1
  Sprites with priority 1
  BG1 tiles with priority 0
  Sprites with priority 0
  BG2 tiles with priority 0

Note that register $2130 may enable Direct Color Mode on BG1.

Mode 4

In Mode 4, you have one 256-color BG and one 4-color BG. To calculate the starting palette index, calculate:

  BG1: 0
  BG2: ppp*4

The priority is (from 'front' to 'back'):
  Sprites with priority 3
  BG1 tiles with priority 1
  Sprites with priority 2
  BG2 tiles with priority 1
  Sprites with priority 1
  BG1 tiles with priority 0
  Sprites with priority 0
  BG2 tiles with priority 0

Note that register $2130 may enable Direct Color Mode on BG1.

Mode 4 is the second of the Offset-Per-Tile Modes. It operates much like Mode 2, however the SNES doesn't have time to load two offset values. Instead, it does this:

  Val = GetTile(BG3, ...)
  if (Val&0x8000) {
      Hval = 0
      Vval = Val
  } else {
      Hval = Val
      Vval = 0
  }
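The same selection, written as a small runnable Python function (the name is hypothetical), in case the branch direction is easy to misread:

```python
def mode4_offsets(val):
    """Split one BG3 tilemap word into (Hval, Vval) as Mode 4 does.

    Bit 15 decides whether the word is a vertical or a horizontal
    offset; the other one is treated as 0.
    """
    if val & 0x8000:
        return 0, val
    return val, 0
```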

Mode 5

In Mode 5, you have one 16-color BG and one 4-color BG. To calculate the starting palette index, calculate:

ppp*ncolors

The priority is (from 'front' to 'back'):
  Sprites with priority 3
  BG1 tiles with priority 1
  Sprites with priority 2
  BG2 tiles with priority 1
  Sprites with priority 1
  BG1 tiles with priority 0
  Sprites with priority 0
  BG2 tiles with priority 0

Mode 5 is rather different from the previous modes. Instead of using an 8/16 pixel wide tile as normal, it always takes a 16 pixel wide tile (the height may still be 8 or 16) and only uses half the pixels (zero-based, the even pixels for subscreen tiles and the odd pixels for mainscreen tiles). Then it forces pseudo-hires on to render a 512-pixel wide scanline. Also, if Interlace mode is on (see bit 0 of $2133), the screen is 448 or 478 half-lines high instead of 224 or 239. Either the odd half-lines or the even half-lines are drawn each frame, as indicated by bit 7 of $213f.

Note that this means you must set $212c and $212d to the same value to get the 'expected' display.

Mode 6

In Mode 6, you have only one 16-color BG. To calculate the starting palette index, calculate:

ppp*ncolors

The priority is (from 'front' to 'back'):
  Sprites with priority 3
  BG1 tiles with priority 1
  Sprites with priority 2
  Sprites with priority 1
  BG1 tiles with priority 0
  Sprites with priority 0

Mode 6 has the same oddities as Mode 5. In addition, it is an offset per tile mode! That part works just like Mode 2. However, remember that Mode 6 always uses 8 pixel (16 half-pixel) wide tiles; this applies to BG3 as well as BG1. You can't apply the offset to an 8-half-pixel tile nor to a 16-pixel wide area (except by using two offset values for the two 8-pixel areas).

Mode 7

Mode 7 is extremely different from all the modes before. You have one BG of 256 colors. However, the tilemap and character map are laid out completely differently.

The tilemap and charactermap are interleaved, with the character data being in the high byte of each word and the tilemap data being in the low byte (note that in hardware, VRAM is set up such that odd bytes are in one RAM chip and even in another, and each RAM chip has a separate address bus. The Mode 7 renderer probably accesses the two chips independently). The tilemap is 128x128 entries of one byte each, with that one byte being simply a character map index. The character data is stored packed pixel rather than bitplaned, with one pixel per byte. Thus, to calculate the tilemap entry byte address for an X and Y position in the playing field, you'd calculate:

(((Y&~7)<<4) + (X>>3))<<1

To find the byte address of the pixel, you'd calculate:

(((TileData<<6) + ((Y&7)<<3) + (X&7))<<1) + 1
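The two address formulas can be tried out directly. The sketch below assumes X and Y are playing field coordinates already reduced to 0-1023 (128 tiles of 8 pixels), and the function names are made up for illustration.

```python
def mode7_tilemap_addr(x, y):
    """VRAM byte address of the tilemap entry covering field pixel (x, y)."""
    return (((y & ~7) << 4) + (x >> 3)) << 1


def mode7_pixel_addr(tile_data, x, y):
    """VRAM byte address of the pixel within character tile_data.

    The +1 selects the high byte of the word, where the character
    data lives.
    """
    return (((tile_data << 6) + ((y & 7) << 3) + (x & 7)) << 1) + 1
```

Tilemap entries therefore sit at even byte addresses and pixels at odd ones, matching the interleaving described above.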

Note that bits 4-7 of $2105 are ignored, as are $2107-$210c. They can be considered to be always 0.
The next odd thing about Mode 7 is that you have full matrix transformation abilities. With creative use of HDMA, you can even change the matrix per scanline. See registers $211b-$2120 for details on the matrix transformation formula. The entire screen can be flipped with bits 0-1 of $211a.

And finally, the playing field can actually be made larger than the tilemap. If bit 7 of $211a is set, bit 6 of $211a controls what is seen filling the space surrounding the map.

The background priorities are:
  Sprites with priority 3
  Sprites with priority 2
  Sprites with priority 1
  BG1
  Sprites with priority 0

When bit 6 of $2133 is set, you get a related mode known as Mode 7 EXTBG. In this mode, you get a BG2 with 128 colors, which uses the same tilemap and character data as BG1 but interprets the high bit of the pixel as a priority bit.

The priority map is:
  Sprites with priority 3
  Sprites with priority 2
  BG2 pixels with priority 1
  Sprites with priority 1
  BG1
  Sprites with priority 0
  BG2 pixels with priority 0

Note that the BG1 pixels (if BG1 is enabled) will usually completely obscure the low-priority BG2 pixels.

BG2 uses the Mode 7 scrolling registers ($210d-e) rather than the 'normal' BG2 ones ($210f-10). Subscreen, pseudo-hires, math, and clip windows work as normal; keep in mind OBJ and that you can do things like enable BG1 on main and BG2 on sub if you so desire. Mosaic is somewhat weird, see the section on Mosaic below.

Note that BG1, being a 256-color BG, can do Direct Color mode (in this case, of course, there is no palette value so you're limited to 256 colors instead of 2048). BG2 does not do direct color mode, since it is only 7-bit.

Rendering the BGs

Rendering a BG is simple.

  1. Get your H and V offsets (either by reading the appropriate registers or by doing the offset-per-tile calculation).
  2. Use those to translate the screen X and Y into playing field X and Y
    • Note this is rather complicated for Mode 7
  3. Look up the tilemap for those coordinates
  4. Use that to find the character data
  5. If necessary, de-bitplane it and stick it in a buffer.

See the section "RENDERING THE SCREEN" below for more details.

Unresolved Issues

  1. What happens to the very first pixel on the scanline in Hires Math?
  2. Various registers still need to know when writing to them is effective.

Windows

The masking windows are pretty simple. The windows can be used to mask off a portion of any BG on the scanline. With HDMA, they can be adjusted per scanline. They can be combined in various ways, per BG. Each can be used to select either the region of the BG to keep, or the region of the BG to hide, per BG. All that's left is to see the registers above and the section "RENDERING THE SCREEN" below for details.

The Color Window

The color window is rather different. The color window itself can be set to clip the colors of pixels to black (before math, so it's almost the same effect you'd get by setting all entries in the palette to black, then fixing them before you do subscreen addition--the only difference is that half math will not occur), and to prevent all color math effects from occurring. These can be applied never, always, inside the "clip" windows specified for the color window, or outside the "clip" window.

Bits 6-7 of register $2130 control whether the pixel colors (and half-math) will be clipped inside the window, outside the window, never, or always. Bits 4-5 do the same for preventing color math.

Consider the main screen set up so BGs 1 and 2 are visible in an 8x8 checkerboard pattern, with all the BG1 pixels red and all the BG2 pixels blue. The subscreen is filled with a green BG, and color math is enabled on BG 1 only. You'll end up with a yellow and blue checkerboard. Turn on the color window to clip colors, and you'll get a green and black checkerboard since the subscreen is only added (to a black pixel) where BG1 would be visible. If you clip math instead, you'll get the same display you'd get with color math disabled on all BGs.

In hires modes, we use the previous main-screen pixel to determine whether the color window effect should be applied to a subscreen pixel. See "Color Math" below for details.

Rendering the screen

Mosaic

The mosaic filter is applied after the BG is rendered and scrolled but before it is clipped, combined with other BGs, pseudo-hiresed, or mathed. Each XxX block of pixels is replaced with the upper-leftmost pixel of the block. The 'blocks' are such that the upper-leftmost block is at the left edge of the screen at the scanline where $2106 was written (or the first visible scanline if it was not written this frame).
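As a hedged sketch of the block replacement just described, the following Python function applies an XxX mosaic to a small 2D list of pixels. It assumes row 0 of the input is the scanline where the block grid starts, and the function name is hypothetical.

```python
def mosaic(rows, size):
    """Replace every size x size block with its upper-left pixel.

    rows: list of equal-length pixel rows; row 0 is taken to be the
    scanline where the mosaic grid begins.
    """
    out = []
    for y, row in enumerate(rows):
        src = rows[y - y % size]  # top row of the block containing y
        out.append([src[x - x % size] for x in range(len(row))])
    return out
```

A block size of 1 leaves the picture unchanged.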

Modes 5/6 Hires work slightly differently: they use a 2XxX block of half-pixels. Similarly, Modes 5/6 interlaced use a 2Xx2X block of half-pixels. So if you set $2106 to $0F ("1x1" blocks), the even half-pixels will be expanded to cover the odd half-pixels. $1F would cover the next even-and-odd pixel over as well. An example: put a single red pixel at line #1 pixel #0 of Mode 5 BG1, and a single blue pixel in the same place on BG2. Enable BG1 on main and BG2 on sub, you'll see the blue pixel only. Set $2106=$03, and you'll suddenly see both the blue and red pixels. Set $2106=$13, and you'll see "BRBR" on two lines.

Mode 7's matrix transformations do not affect the mosaic block positions, so BG1 can be mosaiced about as normal. BG2 in EXTBG mode is weird, though: it uses bit 0 of $2106 to control "vertical mosaic" and bit 1 to control "horizontal mosaic". So if $2106 is $F1, BG2 will expand with 1x16 blocks. $F2 will give 16x1 blocks, and only $F3 will give the expected 16x16 blocks. Note that BG1 still uses bit 0 as usual, so you can have BG1 expanded with 16x16 blocks and the high-priority BG2 pixels expanded with 1x16 blocks on top of it. Or you could have BG1 rendered as normal, but with the high-priority pixels from BG2 expanded 16x1 on top of it.

Color Math

Each main-screen BG (and the color-0 backdrop, and the sprites (although sprites with palettes 0-3 never participate)) may be marked in register $2131 to participate in color math. If the visible pixel is from a layer/OBJ participating in color math, we perform one of 8 operations on the pixel, depending on $2130 bit 1 and $2131 bits 6-7.

  0 00: Add the fixed color. R, G, and B are added separately, and clipped to
        the max.
  0 01: Add the fixed color, and divide the result by 2 before clipping (unless
        the Color Window is clipping colors here).
  0 10: Subtract the fixed color from the pixel. For example, if the pixel is
        (31,31,0) and the fixed color is (0,16,16), the result is (31,15,0).
  0 11: Subtract the fixed color, and divide the result by 2 (unless CW etc).
  1 00: Add the corresponding subscreen pixel, or the fixed color if it's the
        subscreen backdrop.
  1 01: Add the subscreen pixel and divide by 2 (unless CW etc), or add the
        fixed color with no division.
  1 10: Subtract the subscreen pixel/fixed color.
  1 11: Subtract the subscreen pixel and divide by 2 (unless CW etc), or sub
        the fixed color with no division.
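The eight operations can be sketched in Python roughly as below. Colors are (r, g, b) tuples of 5-bit components; the function and parameter names are hypothetical, and the Color Window's suppression of the divide-by-2 is not modelled.

```python
def color_math(pixel, operand, subtract=False, half=False):
    """Add or subtract operand per component, optionally halving,
    then clip each component to 0..31."""
    result = []
    for p, o in zip(pixel, operand):
        value = p - o if subtract else p + o
        if half:
            value //= 2  # halve before clipping; negatives clip to 0 below
        result.append(max(0, min(31, value)))
    return tuple(result)


def main_pixel_math(pixel, fixed, sub_pixel, use_sub, subtract, half):
    """use_sub mirrors $2130 bit 1; subtract/half mirror $2131 bits 7/6.

    sub_pixel is None where only the subscreen backdrop is visible.
    """
    if use_sub and sub_pixel is not None:
        return color_math(pixel, sub_pixel, subtract, half)
    # Fixed color: no division when it stands in for the subscreen backdrop.
    return color_math(pixel, fixed, subtract, half and not use_sub)
```

The worked example from the table, (31,31,0) minus (0,16,16), comes out as (31,15,0).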

In hires modes, color math is applied to the visible subscreen pixels as well. Choosing the math operation is simple: look at the previous main-screen pixel (i.e. if we're at pixel #6 on the 512-pixel screen (which is taken from pixel #4 on the subscreen), we look at pixel #5 (#3 on the main screen)). If no math was applied to that pixel, don't math this subscreen pixel either. If the fixed color was added/subtracted, add/subtract the fixed color. And if a pixel from the subscreen was added/subtracted, add/subtract that main-screen pixel (the original value before math). What happens to the subscreen pixel at the left edge of the screen is unknown.

This is really important with color subtraction: normally, if you have a block of cyan (#00ffff) on main and a block of magenta (#ff00ff) on sub, subtraction would give a block of green. Hires math will give you a block of alternating green and red, which will probably appear yellow on your TV. If you've set $2131 bit 6 and this block is sitting alone in the middle of the backdrop, you'll have a bright line at the left edge where the fixed color was subtracted from the subscreen pixel and no 1/2 was applied (because the previous main pixel had the fixed color subtracted and no 1/2 applied).

Rendering the Screen

Note that this may be inaccurate.
  1. Go down the priority list to find the first BG/OBJ layer that is enabled on main, not clipped, and has a non-transparent pixel here. You'll always bottom out on the backdrop (color 0) if not before.
  2. If the color window clips colors here, set the color of that pixel to 0.
  3. If color math is applicable and the color window doesn't clip math here, do math.

Hires modes (BG modes 5 and 6 or any mode 0-4 with bit 3 of $2133 set) should process the visible subscreen pixels as described above.
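Those three steps for one main-screen pixel might be sketched like this. It is only an illustration under the same caveat about accuracy: the dictionary layout, the names, and the idea of passing the math operation in as a callable are all assumptions made for the example.

```python
BLACK = (0, 0, 0)


def render_main_pixel(layers, backdrop, clip_colors, clip_math, do_math):
    """layers: front-to-back list of dicts with keys 'enabled', 'clipped',
    'color' (None if transparent) and 'math' (participates in color math).
    backdrop: dict with 'color' and 'math'.
    do_math: callable applying the selected color math to a color.
    """
    chosen = backdrop  # step 1 bottoms out here if nothing else is visible
    for layer in layers:
        if layer["enabled"] and not layer["clipped"] and layer["color"] is not None:
            chosen = layer
            break
    color = BLACK if clip_colors else chosen["color"]  # step 2
    if chosen["math"] and not clip_math:               # step 3
        color = do_math(color)
    return color
```

Fed with a red BG1 that has math enabled, a blue BG2 that does not, and a green subscreen, this gives yellow and blue normally, green and black when colors are clipped, and plain red and blue when math is clipped.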

Controllers

The SNES has 2 controller ports on the front of the unit, and an "expansion port" on the bottom (which AFAIK was only used by a few things released only in Japan). Little is known about the expansion port.

A number of peripherals could be plugged into the controller ports:

  • Joypads
  • The Multitap (aka MP5), into which up to 4 joypads may be plugged.
  • A Mouse, with 2 buttons.
  • The SuperScope, a bazooka-like light gun.
  • The Konami Justifiers, a normal style gun into which a second gun could be plugged.

There are probably others, these are just the ones I know anything about.

Generic

The controller ports of the SNES have 7 pins, laid out something like this:

   _________________ ____________
  |                 |            \
  | (1) (2) (3) (4) | (5) (6) (7) |
  |_________________|____________/

The pins are:
1: +5v (power)
2: Clock
3: Latch
4: Data1
5: Data2
6: IOBit
7: Ground

Latch is written through bit 0 of register $4016. Writing 1 to this bit results in Latch going to whatever state means 'latch' to a joypad.

Clock of Port 1 is connected to the 'read' signal of $4016, in that reading $4016 causes Clock to transition. Data1 and Data2 are then read, and Clock transitions back (at this point, the pad is expected to stick its next bits of data on Data1 and Data2). Clock of Port 2 is connected to $4017.

Data1 and Data2 are read through bits 0 and 1 (respectively) of $4016 and $4017 (for Ports 1 and 2, respectively). Thus, you must read both bits at once, you can't choose to read only Data1 and leave Data2 for later.

IOBit is connected to the I/O Port (which is accessed through registers $4201 and $4213). Port 1's IOBit is connected to bit 6 of the I/O Port, and Port 2's IOBit is connected to bit 7. Note that, since bit 7 of the I/O Port is connected to the PPU Counter Latch, anything plugged into Port 2 may latch the H and V Counters by setting IOBit to 0.

Auto Joypad Read, when enabled by bit 0 of $4200, effectively does the following (in pseudo-ASM):

  LDA $4212 
ORA #$01
STA $4212 ; pretend it's writable

LDA #$01
STA $4016
; There may be a delay here
STZ $4016

LDX #$0010
loop:
LDA $4016
REP #$20
LSR
ROL $4218
LSR
ROL $421C
SEP #$20

LDA $4017
REP #$20
LSR
ROL $421A
LSR
ROL $421E
SEP #$20

DEX
BNE loop

LDA $4212
AND #$FE ; clear only bit 0, the bit set at the start
STA $4212 ; pretend it's writable again

"Open Port"

If nothing is plugged into a port (or the thing plugged in doesn't connect to the pin), the SNES will read zeros from Data1 and Data2.

Joypads

The joypads return 16 bits of data out Data1, then 1 bits until latched again. The data is:

  byetUDLRaxlr0000

b/y/a/x/l/r are the similarly named buttons. 'e' is select. 't' is start. U/D/L/R are the pad directions. Note that the standard joypad can only return either U or D set, and either L or R set. Some games will crash or exhibit other odd behavior if both U and D and/or both L and R are set.
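To make the bit layout concrete, here is a small Python decoder. It assumes the 16 bits have been collected into a word with the first bit read in bit 15, which is what rotating each new bit in at the bottom (as in the Auto Joypad Read pseudo-ASM) produces; the names are hypothetical.

```python
JOYPAD_BITS = "byetUDLRaxlr0000"  # serial order, first bit read on the left


def decode_joypad(word):
    """Return the set of pressed buttons for a 16-bit joypad word.

    Uppercase U/D/L/R are directions; lowercase l/r are the shoulder
    buttons, 'e' is select and 't' is start.
    """
    pressed = set()
    for i, name in enumerate(JOYPAD_BITS):
        if name != "0" and word & (0x8000 >> i):
            pressed.add(name)
    return pressed
```

For example, a word of $0A00 decodes as Up plus Left.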

Data2 is not even connected, nor is IOBit.

Mouse

The mouse returns 32 bits of data out Data1, and 1 bits thereafter. The data is:

  00000000rlss0001 YyyyyyyyXxxxxxxx

l/r are the two mouse buttons. 'ss' are the "speed bits", which are incremented mod 3 if Clock cycles while Latch is active. Y/X are the direction bits (set is up/left), and yyyyyyy/xxxxxxx are the distance traveled in the appropriate direction.
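A possible decoder for the 32 mouse bits is sketched below. It assumes the bits were packed with the first bit read in bit 31, and it chooses to report movement as signed values with left and up negative; both the names and that sign convention are choices made for this example only.

```python
def decode_mouse(bits):
    """Decode a 32-bit mouse report, or return None if the fixed
    bits (00000000....0001 in the first word) do not match."""
    first = (bits >> 16) & 0xFFFF
    second = bits & 0xFFFF
    if first & 0xFF0F != 0x0001:
        return None
    dy = (second >> 8) & 0x7F
    dx = second & 0x7F
    if second & 0x8000:  # Y direction bit set: up
        dy = -dy
    if second & 0x0080:  # X direction bit set: left
        dx = -dx
    return {
        "right": bool(first & 0x80),
        "left": bool(first & 0x40),
        "speed": (first >> 4) & 3,
        "dx": dx,
        "dy": dy,
    }
```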

Supposedly, the 'speed bits' may not match the internal speed setting when the mouse first receives power. The speed setting controls the delta curve of the mouse, with 0 giving a flat curve and 2 giving the greatest delta response.

Data2 and IOBit are presumably not connected, but this is not known for sure.

SuperScope

The SuperScope returns 8 bits of data out Data1, and 1 bits thereafter. The data is:

  fctp00on

'f' is Fire, 'c' is Cursor, 't' is Turbo, 'p' is Pause, 'o' is Offscreen, and 'n' is Noise.

The SuperScope has two modes of operation: normal mode and turbo mode. The current mode is controlled by a switch on the unit, and is indicated by the 't' bit. Note however that the 't' bit is only updated when the Fire button is pressed (i.e. the 'f' bit is set). Thus, when you turn turbo on the 't' bit remains clear until you shoot, and similarly when turbo is deactivated the bit remains set until you fire.

In either mode, the Pause bit will be set for the first strobe after the pause button is pressed, and then will be clear for subsequent strobes until the button is pressed again. However, the pause button is ignored if either cursor or fire are down(?).

In either mode, the Cursor bit will be set while the Cursor button is pressed.

In normal mode, the Fire bit operates like Pause: it is on for only one strobe. In turbo mode, it remains set as long as the button is held down.

When Fire/Cursor are set, Offscreen will be set if the gun did not latch during the previous strobe and cleared otherwise (Offscreen is not altered when Fire/Cursor are both clear).

Noise is set if there is interference in the infrared transmission from the Scope to the receiver.

If the Fire button is being held when turbo mode is activated, the gun sets the Fire bit and begins latching. If the Fire button is being held when turbo mode is deactivated, the next poll will have Fire clear but the Turbo bit will not be updated until the next fire (i.e. FcTp => turbo off => fcTp, not fctp).

The PPU latch operates as follows: When Fire or Cursor is set, IOBit is set to 0 when the gun sees the TV's electron gun, and left a 1 otherwise. Thus, if the SNES also leaves it one (bit 7 of $4201), the PPU Counters will be latched at that point. This would also imply that bit 7 of $4213 will be 0 at the moment the SuperScope sees the electron gun.

Since the gun depends on the latching behavior of IOBit, it will only function properly when plugged into Port 2. If plugged into Port 1 instead, everything will work except that there will be no way to tell where on the screen the gun is pointing.

When creating graphics for the SuperScope, note that the color red is not detected. For best results, use colors with the blue component over 75% and/or the green component over 50%.

Data2 is presumably not connected, but this is not known for sure.

Justifiers

The Justifier returns 48 bits of data out Data1. Presumably it returns one bits after (if so, it really only returns 32 bits), but this is not known. The data is:

  0000000000001110 01010101TtSsl000 1111111111111111

T/t are the trigger states for guns 1 and 2. S/s are the start button states for guns 1 and 2. 'l' indicates which gun was connected to IOBit: 1 means gun 1, 0 means gun 2. Note that 'l' toggles even when gun 2 is not connected.

IOBit is used just like for the SuperScope. However, since two guns may be plugged into one port, which gun is actually connected to IOBit changes each time Latch cycles. Also note, the Justifier does not wait for the trigger to be pulled before attempting to latch, it will latch every time it sees the electron gun. Bit 6 of $213F may be used to determine if the Justifier was pointed at the screen or not.

Data2 is presumably not connected, but this is not known for sure.

MP5

The MP5 plugs into one Controller Port on the SNES (typically Port 2), and has 4 ports for controllers to be plugged into it (labeled 2 through 5). It also has an override switch which makes it pass through Pad 2 and ignore everything else.

If IOBit is 1, Clock is passed through to Pad 2 and Pad 3, Data1 is connected to Data1 on Pad 2, and Data2 is connected to Data1 on Pad 3. If IOBit is 0, Pads 4 and 5 are used instead of 2 and 3, respectively. In either case, Latch is passed through to all pads, and IOBit is presumably not passed through at all.

Note that Clock is only passed through to the pads that are actually being passed through. Thus, you can read the first two pads (or let Auto-Joypad Read do it), then toggle IOBit and read the other two pads manually. Most games requiring more than 3 players do exactly this.

Also note that there is nothing preventing the MP5 from functioning perfectly when plugged in to Port 1, except that the game must use bit 6 of $4201 instead of bit 7 to set IOBit and must use the Port 1 registers instead of the Port 2 registers. With 2 MP5 units, one could actually create an 8-player game!

When Latch is active, 1s will be read from Data2 and 0s from Data1. This is sometimes used to detect the presence of an MP5 unit. The override switch disables this behavior.

There are reports that the MP5 does not react immediately when IOBit is transitioned from 0 to 1. Thus, reading 2&3 then 4&5 will probably work better than vice versa.

DMA and HDMA

DMA, or "direct memory access" is found in a number of computer systems, not just the Super Nintendo. It's basically a way for a peripheral or coprocessor to read data directly from memory, instead of requiring the main CPU to do a number of reads and writes. This is typically faster, if only because it lets the system skip the opcode fetch-and-decode. In the SNES, the CPU is paused during DMA since the address busses are in use for the transfer.

HDMA is similar in concept, though rather different in execution: instead of transferring a block of memory all at once, it transfers a few bytes during the H-Blank period of each scanline. This is extremely helpful, since mid-frame most PPU registers can only be changed (at least without glitching) during this narrow window.

The SNES has 8 channels (numbered 0-7) that can be used for either DMA or HDMA. HDMA takes priority over DMA if both are to occur at once, pausing all DMA and terminating a conflicting DMA immediately. Lower-numbered channels take priority over higher-numbered channels.

DMA

A DMA transfer has three main variables, and a number of setting bits. These are: (those marked '*' must be set up before starting DMA)

  • Direction (bit 7 of $43x0): Read from PPU or write to PPU?
  • Fixed (bit 3 of $43x0): Adjust Address?
  • Increment (bit 4 of $43x0): Direction to adjust Address?
  • Mode (bits 0-2 of $43x0): See below...
  • Port (register $43x1): If this is 'xx', the register accessed will be $21xx.
  • AAddress (registers $43x2-4): Any CPU address, just like you'd use with the Absolute Long addressing mode.
  • Count (registers $43x5-6): The number of bytes to transfer.

See register $43x0 for the correspondence between the Mode bits and the transfer mode. Note that One Register Write Once and One Register Write Twice end up being the exact same thing, and Two Registers Write Once and Two Registers Write Twice Alternate are the same, but that Two Registers Write Once and Two Registers Write Twice Each are different.

DMA transfers take 8 master cycles per byte transferred, no matter the FastROM setting. There is also an overhead of 8 master cycles per channel, and an overhead of 12-24 cycles for the whole transfer.
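Putting those figures together gives a quick estimate of how long a DMA stalls the CPU. The helper below is a sketch with a made-up name; it returns a range because the whole-transfer overhead is only given as 12-24 cycles.

```python
def dma_master_cycles(bytes_per_channel):
    """Return (min, max) master cycles for one DMA.

    bytes_per_channel: list with the byte count of each active channel.
    Uses 8 cycles per byte, 8 per channel, and 12-24 overall.
    """
    body = sum(8 + 8 * n for n in bytes_per_channel)
    return body + 12, body + 24
```

A single channel moving 65536 bytes therefore costs a little over 524,000 master cycles.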

The basic process seems to be:

  1. Get byte and write it to the destination.
    • The DMA seems to take advantage of the SNES's two address busses with one shared data bus. AAddress is pushed out Bus A, Port is pushed out bus B, and the read/write signals are sent according to Direction. The bus marked read obligingly puts data on the bus, while the bus marked write obligingly writes that value.
    • Thus, since the PPU/APU/WRAM registers are only accessible via Bus B, attempts to access them via AAddress will result in Open Bus accesses.
    • Attempts to access WRAM via both Bus A and Bus B (registers 2180-3) will fail, with the 2180-3 access being Open Bussed.
    • Also, DMA cannot access the $4300-$437f registers nor $420b nor $420c. Writes will have no effect, and reads will return Open Bus.

  2. Adjust AAddress.
    • If Fixed is set, do nothing. Else if Increment is set, subtract one, else add one.
    • Note that the bank byte is not modified.

  3. Decrement Count. If count is not zero, then go to step 1.
    • Thus, if Count is initially zero, it wraps to 65535 before being tested. So you end up transferring 65536 bytes.

Note that Count ($43x5-6) ends up always 0, unless a conflicting HDMA terminates the transfer early.
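The address and count handling in the steps above can be simulated in a few lines. This sketch only tracks the A-bus address of each byte (no data is moved), and the function and parameter names are hypothetical; `decrement` stands for the Increment bit being set.

```python
def dma_a_addresses(bank, addr, count, fixed=False, decrement=False):
    """Return the 24-bit A-bus address used for each transferred byte.

    The 16-bit address wraps within the bank, and a Count of 0 wraps
    to 65535 before being tested, giving 65536 bytes.
    """
    out = []
    while True:
        out.append((bank << 16) | addr)
        if not fixed:
            addr = (addr + (-1 if decrement else 1)) & 0xFFFF  # bank untouched
        count = (count - 1) & 0xFFFF
        if count == 0:
            break
    return out
```

Starting at $7E:FFFF with a count of 2 yields $7EFFFF then $7E0000, not $7F0000.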

HDMA

HDMA has 4 flags and 5 variables. Again, those marked '*' are required before starting HDMA. In addition, those marked '+' are required if HDMA is to be started mid-frame.

* Addressing Mode (bit 6 of $43x0): If clear, Direct, else Indirect. 
* Transfer Mode (bits 0-2 of $43x0): See below...
* Port ($43x1): As for DMA.
* AAddress ($43x2-4): Pointer to the HDMA Table. Not really 'required' for
starting mid-frame, but unless you're going to stop it before the next
init...
- Indirect Address ($43x5-6): Used with Indirect Bank. See below...
* Indirect Bank ($43x7): Used with Indirect Address. See below...
+ Address ($43x8-9): See below...
+ Repeat (bit 7 of $43xA): Whether to write every scanline or not
+ Line Counter (bits 0-6 of $43xA): See below...
- DoTransfer: Used internally.

Modes are the same as for DMA. However, note that only one cycle through the mode is done per scanline, so One Register Write Once will write 1 byte per scanline, while One Register Write Twice will write two.

For each scanline during which HDMA is active (i.e. at least one channel is not paused and has not terminated yet for the frame), there are ~18 master cycles overhead. Each active channel incurs another 8 master cycles overhead (during which time $43xA is presumably loaded if necessary) for every scanline, whether or not a transfer actually occurs. If a new indirect address is required, 16 master cycles are taken to load it. Then 8 cycles per byte transferred are used. Thus, HDMA takes a maximum of 466 master cycles per scanline (if all 8 channels are active, require an indirect address load, and transfer 4 bytes).
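The 466-cycle maximum follows directly from those numbers: 18 + 8 * (8 + 16 + 4 * 8). A small sketch, with a hypothetical function name and treating the approximate 18-cycle overhead as exact:

```python
def hdma_scanline_cycles(channels):
    """Master cycles HDMA uses on one scanline.

    channels: list of (bytes_transferred, needs_indirect_load) tuples,
    one per active channel. Returns 0 if no channel is active.
    """
    if not channels:
        return 0
    total = 18  # per-scanline overhead (approximate in the text)
    for nbytes, indirect_load in channels:
        total += 8 + (16 if indirect_load else 0) + 8 * nbytes
    return total
```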

The basic process has two sections. First, at the beginning of the frame (V=0 H=approx 6), for all active HDMA channels (see register $420c):

  1. Copy AAddress into Address.
  2. Load $43xA (Line Counter and Repeat) from the table. I believe $00 will terminate this channel immediately.
  3. Load Indirect Address, if necessary.
  4. Set DoTransfer to true.

The CPU is paused during this time. Overhead is ~18 master cycles, plus 8 master cycles for each channel set for direct HDMA and 24 master cycles for each channel set for indirect HDMA.

If you are starting HDMA mid-frame, you must basically do the init process manually by setting $43x8-A, and $43x5-6 for indirect channels. Note though that there is no way to perform step 4, so no transfer will be done the first transfer period. Also, note that a channel that has already terminated for the frame cannot be restarted. XXX: Or does it automatically do Step 4 when you enable the channel?

Then, for each scanline from V=0 to V=$e0 (or V=$ef if overscan is enabled) at about H=$116:

  1. If DoTransfer is false, skip to step 3.
  2. For the number of bytes (1, 2, or 4) required for this Transfer Mode...
    • a. Read a byte from Address or Indirect Address, and increment.
    • b. Write the byte to Port, Port+1, Port+2, or Port+3, depending on the Transfer Mode and which byte we're on. - The same notes regarding DMA from PPU to PPU or RAM to RAM via $2180 apply here as well.

  3. Decrement $43xA.
  4. Set DoTransfer to the value of Repeat.
  5. If Line Counter is zero...
    • a. Read the next byte from Address into $43xA (thus, into both Line Counter and Repeat).
    • b. If Addressing Mode is Indirect, read two bytes from Address into Indirect Address (and increment Address by two bytes). - One oddity: if $43xA is 0 and this is the last active HDMA channel for this scanline, only load one byte for Address, and use the $00 for the low byte. So Address ends up incremented one less than otherwise expected, and one less CPU Cycle is used.
    • c. If $43xA is zero, terminate this HDMA channel for this frame. The bit in $420c is not cleared, though, so it may be automatically restarted next frame.
    • d. Set DoTransfer to true.

  6. Continue with Step 1 next scanline.

HDMA does not occur during V-Blank, as any writes it might perform are likely to have no visible effect anyway. The start-of-frame processing then resets all active channels at the end of V-Blank. This allows updating of the HDMA registers during V-Blank without worrying about the transfer beginning immediately and scribbling on the PPU state.

Note how the above implicitly defines the format of the HDMA table. Explicitly, the format is a series of entries. Each entry begins with a line count and repeat flag. If repeat is false, there is one scanline worth of data following and the count is the number of scanlines to wait before processing the next entry. If it's true, the line count is the number of scanlines worth of data following. The data following is either a pointer to the data (for Indirect HDMA), or the data itself (for Direct HDMA).
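As an illustration of that table format, here is a sketch of a parser for a Direct HDMA table. The function name is hypothetical, `bytes_per_line` is the 1, 2 or 4 bytes implied by the transfer mode, and a header with repeat set and a zero line count ($80) is deliberately not modelled.

```python
def parse_direct_hdma_table(table, bytes_per_line):
    """Split a Direct HDMA table into (count, repeat, data) entries.

    data is a list with one bytes object per scanline written. Parsing
    stops at a $00 header, which terminates the channel.
    """
    entries = []
    pos = 0
    while True:
        header = table[pos]
        pos += 1
        if header == 0x00:
            break
        repeat = bool(header & 0x80)
        count = header & 0x7F
        if count == 0:
            raise ValueError("header $80 is not modelled in this sketch")
        lines = count if repeat else 1  # repeat: one data unit per scanline
        data = []
        for _ in range(lines):
            data.append(bytes(table[pos:pos + bytes_per_line]))
            pos += bytes_per_line
        entries.append((count, repeat, data))
    return entries
```

For instance, the table 03 AA 82 01 02 00 in a one-byte mode writes $AA once and waits 3 scanlines, then writes $01 and $02 on the next two scanlines, then stops.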

Looking at the above, it's clear why Address, and Repeat/Line Counter must be initialized by hand when starting HDMA mid-frame: they're only automatically initialized at the start of the frame. Note how AAddress is not affected by HDMA, though Address and Repeat/Line Counter are.

History

In the beginning... Well, ok, somewhere in the middle is where I came in. In my beginning, there was Yoshi's register doc (the one with bananas) and snesmap.txt (the one missing all those appendices). Both good register docs, complete as far as bits go, but sorely lacking in the explanations. And there was the snes9x source, with more clear semantics but many errors. I began writing test ROMs for others to run, and discover the real behavior of the SNES. Soon, someone lent me a device to test the ROMs myself.

I discovered many things, and lamented the errors in the documentation available. Finally one day I decided to sit down and write out everything I had discovered. An early version of this was the result. Along the way, and continuing since, I've revised this document with new findings. I believe this is the most accurate SNES PPU/graphics document available today. Enjoy!
