RSP Command queue. More...

Data Structures
struct	rspq_overlay_header_t
	The header of the overlay in DMEM. More...

struct	rspq_ctx_t
	RSP queue building context. More...

Macros
#define	rspq_append1(ptr, cmd, arg1)
	Smaller version of rspq_write that writes to an arbitrary pointer.

#define	rspq_append2(ptr, cmd, arg1, arg2)
	Smaller version of rspq_write that writes to an arbitrary pointer.

#define	rspq_append3(ptr, cmd, arg1, arg2, arg3)
	Smaller version of rspq_write that writes to an arbitrary pointer.

Functions
	DEFINE_RSP_UCODE (rsp_queue,.crash_handler=rspq_crash_handler,.assert_handler=rspq_assert_handler)

void	rspq_init (void)
	Initialize the RSPQ library.

void	rspq_close (void)
	Shut down the RSPQ library.

void *	rspq_overlay_get_state (rsp_ucode_t *overlay_ucode)
	Return a pointer to the overlay state (in RDRAM)

rsp_queue_t *	__rspq_get_state (void)
	Return a pointer to a copy of the current RSPQ state.

uint32_t	rspq_overlay_register (rsp_ucode_t *overlay_ucode)
	Register a rspq overlay into the RSP queue engine.

void	rspq_overlay_register_static (rsp_ucode_t *overlay_ucode, uint32_t overlay_id)
	Register an overlay into the RSP queue engine assigning a static ID to it.

void	rspq_overlay_unregister (uint32_t overlay_id)
	Unregister a ucode overlay from the RSP queue engine.

void	rspq_next_buffer (void)
	Switch to the next write buffer for the current RSP queue.

void	rspq_flush (void)
	Make sure that RSP starts executing up to the last written command.

void	rspq_highpri_begin (void)
	Start building a high-priority queue.

void	rspq_highpri_end (void)
	Finish building the high-priority queue and close it.

void	rspq_highpri_sync (void)
	Wait for the RSP to finish processing all high-priority queues.

void	rspq_block_begin (void)
	Begin creating a new block.

rspq_block_t *	rspq_block_end (void)
	Finish creating a block.

void	rspq_block_free (rspq_block_t *block)
	Free a block that is not needed any more.

void	rspq_block_run (rspq_block_t *block)
	Add to the RSP queue a command that runs a block.

void	rspq_block_run_rsp (int nesting_level)
	Notify that a RSP command is going to run a block.

void	rspq_noop (void)
	Enqueue a no-op command in the queue.

rspq_syncpoint_t	rspq_syncpoint_new (void)
	Create a syncpoint in the queue.

bool	rspq_syncpoint_check (rspq_syncpoint_t sync_id)
	Check whether a syncpoint was reached by RSP or not.

void	rspq_syncpoint_wait (rspq_syncpoint_t sync_id)
	Wait until a syncpoint is reached by RSP.

void	rspq_wait (void)
	Wait until all commands in the queue have been executed by RSP.

void	rspq_dma_to_rdram (void *rdram_addr, uint32_t dmem_addr, uint32_t len, bool is_async)
	Enqueue a command to do a DMA transfer from DMEM to RDRAM.

void	rspq_dma_to_dmem (uint32_t dmem_addr, void *rdram_addr, uint32_t len, bool is_async)
	Enqueue a command to do a DMA transfer from RDRAM to DMEM.

rspq_write_t	rspq_write_begin (uint32_t ovl_id, uint32_t cmd_id, int size)
	Begin writing a new command into the RSP queue.

void	rspq_write_arg (rspq_write_t *w, uint32_t value)
	Add one argument to the command being enqueued.

void	rspq_write_end (rspq_write_t *w)
	Finish enqueuing a command into the queue.

Variables
rsp_ucode_t *	rspq_overlay_ucodes [RSPQ_MAX_OVERLAY_COUNT]
	RSPQ overlays.

rspq_ctx_t *	rspq_ctx
	Current context.

volatile uint32_t *	rspq_cur_pointer
	Copy of the current write pointer (see rspq_ctx_t)

volatile uint32_t *	rspq_cur_sentinel
	Copy of the current write sentinel (see rspq_ctx_t)

void *	rspq_rdp_dynamic_buffers [2]
	Buffers that hold outgoing RDP commands (generated via RSP).

rspq_block_t *	rspq_block
	Pointer to the current block being built, or NULL.

volatile int	__rspq_syncpoints_done
	ID of the last syncpoint reached by RSP.

Detailed Description

RSP Command queue.

RSP Queue: implementation

This documentation block describes the internal workings of the RSP Queue, which is useful for understanding the implementation. For a description of the API of the RSP queue, see rspq.h

Architecture

The RSP queue can be thought of, in the abstract, as a single contiguous memory buffer that contains RSP commands. The CPU is the writer, which appends commands to the buffer. The RSP is the reader, which reads commands and executes them. Both work at the same time on the same buffer, so careful engineering is required to make sure that they do not interfere with each other.

The complexity of this library comes from trying to achieve this design without any explicit synchronization primitive. The basic design constraint is that, in the standard code path, the CPU should be able to just append a new command to the buffer without talking to the RSP, and the RSP should be able to just read a new command from the buffer without talking to the CPU. Obviously there are corner cases where synchronization is required (eg: if the RSP catches up with the CPU, or if the CPU finds that the buffer is full), but these cases should in general be rare.

To achieve a fully lockless approach, there are specific rules that the CPU has to follow while writing to make sure that the RSP does not get confused and execute invalid or partially-written commands. On the other hand, the RSP must be careful in discerning between a fully-written command and a partially-written command, and at the same time not waste memory bandwidth to continuously "poll" the buffer when it has caught up with the CPU.

The RSP uses the following algorithm to parse the buffer contents. Assume for now that the buffer is linear and unlimited in size.

1. The RSP fetches a "portion" of the buffer from RDRAM to DMEM. The size of the portion is RSPQ_DMEM_BUFFER_SIZE. It also resets its internal read pointer to the start of the DMEM buffer.
2. The RSP reads the first byte pointed to by the internal read pointer. The first byte is the command ID. It splits it into overlay ID (4 bits) and command index (4 bits).
3. If the command is 0x00 (overlay 0, index 0), it means that the RSP has caught up with the CPU and there are no more pending commands.
   - The RSP checks whether the signal SIG_MORE was set by the CPU. This signal is set any time the CPU writes a new command in the queue. If the signal is set, it means that the CPU has continued writing but the RSP has probably fetched the buffer before those commands were written. The RSP goes back to step 1 (refetch the buffer, from the current position).
   - If SIG_MORE is not set, the RSP has really caught up with the CPU, and no more commands are available in the queue. The RSP goes to sleep via the BREAK opcode, and waits for the CPU to wake it up when more commands are available.
   - After the CPU has woken the RSP, it goes back to step 1.
4. If the overlay ID refers to an overlay which is not the currently loaded one, the RSP loads the new overlay into IMEM/DMEM. Before doing so, it also saves the current overlay's state back into RDRAM (this is a portion of DMEM specified by the overlay itself as "state", that is preserved across overlay switching).
5. The RSP uses the command index to fetch the "command descriptor", a small structure that contains a pointer to the function in IMEM that executes the command, and the size of the command in words.
6. If the command overflows the internal buffer (that is, it is longer than the number of bytes left in the buffer), it means that we need to refetch a subsequent portion of the buffer to see the whole command. Go back to step 1.
7. The RSP jumps to the function that executes the command. After the command is finished, the function is expected to jump back to the main loop, going to step 2.
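
The parsing rules above can be modeled on the host with a few lines of C. This is only a sketch: it is not the RSP ucode, it ignores overlay loading and the SIG_MORE handshake, and the size table `size_by_cmd` (standing in for the command descriptors) and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a command ID byte into overlay ID (top 4 bits) and command index (low 4 bits). */
static inline uint8_t cmd_overlay(uint8_t cmd_id) { return (uint8_t)(cmd_id >> 4); }
static inline uint8_t cmd_index(uint8_t cmd_id)   { return (uint8_t)(cmd_id & 0xF); }

/*
 * Hypothetical host-side model of the main loop: walk a zero-initialized
 * queue and count the commands that are completely visible.
 * size_by_cmd[id] is the command size in words (stand-in for the descriptor).
 */
static size_t count_commands(const uint32_t *queue, size_t n_words,
                             const uint8_t *size_by_cmd)
{
    size_t pos = 0, count = 0;
    while (pos < n_words) {
        uint8_t cmd_id = (uint8_t)(queue[pos] >> 24); /* first byte of the word */
        if (cmd_id == 0x00)
            break;                  /* caught up with the CPU: nothing to run */
        uint8_t size = size_by_cmd[cmd_id];
        assert(size > 0);           /* unknown command in this model */
        if (pos + size > n_words)
            break;                  /* command overflows: the RSP would refetch */
        pos += size;                /* "execute" the command and move on */
        count++;
    }
    return count;
}
```

In this model the walk stops at the first zero command ID, which is why unwritten parts of the buffer must read as 0x00.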

Given the above algorithm, it is easy to understand how the CPU must behave when filling the buffer:

The buffer must be initialized with 0x00. This makes sure that unwritten portions of the buffer are seen as "special command 0x00" by the RSP.
The CPU must take special care not to write the command ID before the full command is written. For instance, let's say a command is made of two words: 0xAB000001 0xFFFF8000 (overlay 0xA, command index 0xB, length 2). If the CPU writes the two words in the standard order, there might be a race where the RSP reads the memory via DMA when only the first word has been written, and thus sees 0xAB000001 0x00000000, executing the command with a wrong second word. So the CPU has to write the first word last (or at least its first byte must be written last).
It is important that the C compiler does not reorder writes. In general, compilers are allowed to change the order in which writes are performed in a buffer. For instance, if the code writes to buf[0], buf[1], buf[2], the compiler might decide to generate code that writes buf[2] first, for optimization reasons. It is possible to fix it using the MEMORY_BARRIER macro, or the volatile qualifier (which guarantees a fixed order of accesses between volatile pointers, though non-volatile accesses can be reordered freely also across volatile ones).
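
As an illustration of these rules, the following sketch writes the two-word command from the example with the command ID word last. The function name `write_cmd2` is hypothetical; volatile accesses are used here only to keep the compiler from reordering the two stores, as described above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: write a two-word command so that a concurrent reader never sees a
 * valid command ID followed by a stale second word. `buf` is assumed to
 * point into zero-initialized queue memory.
 */
static volatile uint32_t *write_cmd2(volatile uint32_t *buf,
                                     uint32_t word0, uint32_t word1)
{
    assert((word0 >> 24) != 0);  /* command ID 0x00 means "no command" */
    buf[1] = word1;              /* arguments first ... */
    buf[0] = word0;              /* ... command ID word last */
    return buf + 2;              /* next write position */
}
```

Until the last store lands, a reader still sees 0x00 in the first byte and treats the slot as empty.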

Internal commands

To manage the queue and implement all the various features, rspq reserves for itself the overlay ID 0x0 to implement internal commands. You can look at the list of commands and their description below. All command IDs are defined with RSPQ_CMD_* macros.

Buffer swapping

Internally, double buffering is used to implement the queue. The size of each of the buffers is RSPQ_DRAM_LOWPRI_BUFFER_SIZE. When a buffer is full, the queue engine writes a RSPQ_CMD_JUMP command with the address of the other buffer, to tell the RSP to jump there when it is done.

Moreover, just before the jump, the engine also enqueues a RSPQ_CMD_WRITE_STATUS command that sets the SP_STATUS_SIG_BUFDONE_LOW signal. This is used to keep track of when the RSP has finished processing a buffer, so that we know it becomes free again for more commands.

This logic is implemented in rspq_next_buffer.

Blocks

Blocks are implemented by redirecting rspq_write to a different memory buffer, allocated for the block. The starting size for this buffer is RSPQ_BLOCK_MIN_SIZE. If the buffer becomes full, a new buffer is allocated with double the size (to achieve exponential growth), and it is linked to the previous buffer via a RSPQ_CMD_JUMP. So a block can end up being defined by multiple memory buffers linked via jumps.
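
To give a feel for the exponential growth, this sketch computes how many chained buffers a block of a given size would occupy. It is a simplified model, not the library code: the function name is hypothetical, and the words taken by the linking RSPQ_CMD_JUMP commands are ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch: number of chained buffers needed to hold `total_words`, when the
 * first buffer holds `min_size` words and every new buffer is twice as large
 * as the previous one.
 */
static size_t block_buffers_needed(size_t total_words, size_t min_size)
{
    assert(min_size > 0);
    size_t buffers = 1, size = min_size, capacity = min_size;
    while (capacity < total_words) {
        size *= 2;          /* each new buffer doubles in size */
        capacity += size;   /* total room across all chained buffers */
        buffers++;
    }
    return buffers;
}
```

With a first buffer of 64 words, a block of 193 words would span three buffers in this model (64 + 128 + 256 words).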

Calling a block requires some work because of the nested calls we want to support. To make the RSP ucode as short as possible, the two internal commands dedicated to block calls (RSPQ_CMD_CALL and RSPQ_CMD_RET) do not manage a call stack by themselves, but only allow saving/restoring the current queue position to/from a "save slot", whose index must be provided by the CPU.

Thus, the CPU has to make sure that each CALL opcode saves the position into a save slot which will not be overwritten by nested block calls. To do this, it calculates the "nesting level" of a block at block creation time: the nesting level of a block is defined as the smallest number greater than the nesting levels of all blocks that are called within the block itself. So for instance if a block calls another block whose nesting level is 5, it will get assigned a level of 6. The nesting level is then used as the save slot index both by all future calls to the block, and by the RSPQ_CMD_RET command placed at the end of the block itself.
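
The nesting level rule can be written down as a small function. This is a sketch with a hypothetical name; it assumes that levels are non-negative and that a block calling no other block gets level 0.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch: nesting level of a block, given the nesting levels of the blocks
 * it calls. The result is the smallest number greater than all of them.
 */
static int block_nesting_level(const int *callee_levels, size_t n_callees)
{
    int level = 0;  /* assumption: a block with no nested calls is level 0 */
    for (size_t i = 0; i < n_callees; i++) {
        assert(callee_levels[i] >= 0);
        if (callee_levels[i] + 1 > level)
            level = callee_levels[i] + 1;
    }
    return level;
}
```

Because a caller always ends up with a higher level than any of its callees, the save slot it uses cannot be overwritten by the calls nested inside it.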

Highpri queue

The high priority queue is implemented as an alternative couple of buffers, that replace the standard buffers when the high priority mode is activated.

When rspq_highpri_begin is called, the CPU notifies the RSP that it must switch to the highpri queues by setting signal SP_STATUS_SIG_HIGHPRI_REQUESTED. The RSP checks for that signal between each command, and when it sees it, it internally calls RSPQ_CMD_SWAP_BUFFERS. This command loads the highpri queue pointer from a special call slot, saves the current lowpri queue position in another special save slot, and finally clears SP_STATUS_SIG_HIGHPRI_REQUESTED and sets SP_STATUS_SIG_HIGHPRI_RUNNING instead.

When rspq_highpri_end is called, the opposite is done. The CPU writes into the queue a RSPQ_CMD_SWAP_BUFFERS that saves the current highpri pointer into its call slot, recovers the previous lowpri position, and turns off SP_STATUS_SIG_HIGHPRI_RUNNING.

Some careful tricks are necessary to allow multiple highpri queues to be pending, see rspq_highpri_begin for details.

rdpq integrations

There are a few places where the rspq code is hooked with rdpq to provide for coherent usage of the two peripherals. In particular:

rspq_wait automatically calls rdpq_fence. This means that it will also wait for RDP to finish executing all commands, which is actually expected for its intended usage of "full sync for debugging purposes".
All rspq block creation functions call into hooks in rdpq. This is necessary because blocks are specially handled by rdpq via static buffer, to make sure that RDP commands in the block do not pass through the RSP, but are instead transferred via DMA directly from RDRAM into RDP. See rdpq.c documentation for more details.
In specific places, we call into the rdpq debugging module to help tracing the RDP commands. For instance, when switching RDP RDRAM buffers, RSP will generate an interrupt to inform the debugging code that it needs to finish dumping the previous RDP buffer.

Data Structure Documentation

◆ rspq_overlay_header_t

struct rspq_overlay_header_t

The header of the overlay in DMEM.

This structure is placed at the start of the overlay in DMEM, via the RSPQ_OverlayHeader macros (defined in rsp_queue.inc).

Data Fields
uint16_t	state_start	Start of the portion of DMEM used as "state".
uint16_t	state_size	Size of the portion of DMEM used as "state".
uint16_t	command_base	Primary overlay ID used for this overlay.
uint16_t	reserved	Unused.
uint16_t	commands[]

◆ rspq_ctx_t

struct rspq_ctx_t

RSP queue building context.

This structure contains the state of a RSP queue as it is built by the CPU. It is instantiated two times: one for the lowpri queue, and one for the highpri queue. It contains the two buffers used in the double buffering scheme, and some metadata about the queue.

The current write pointer is stored in the "cur" field. The "sentinel" field contains the pointer to the last byte at which a new command can start, before overflowing the buffer (given RSPQ_MAX_COMMAND_SIZE). This is used to efficiently check when it is time to switch to the other buffer: basically, it is sufficient to check whether "cur > sentinel".

The current queue is stored in 3 global pointers: rspq_ctx, rspq_cur_pointer and rspq_cur_sentinel. rspq_cur_pointer and rspq_cur_sentinel are external copies of the "cur" and "sentinel" pointer of the current context, but they are kept as separate global variables for maximum performance of the hottest code path: rspq_write. In fact, it is much faster to access a global 32-bit pointer (via gp-relative offset) than dereferencing a member of a global structure pointer.

rspq_switch_context is called to switch between lowpri and highpri, updating the three global pointers.

When building a block, rspq_ctx is set to NULL, while the other two pointers point inside the block memory.
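
The sentinel check described above can be sketched as follows. The struct is a simplified, hypothetical stand-in (not the real rspq_ctx_t), and the way the sentinel is derived from the buffer size and the maximum command size is an assumption based on the description, with all sizes expressed in 32-bit words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified write state: just the two hot pointers. */
typedef struct {
    uint32_t *cur;       /* current write pointer */
    uint32_t *sentinel;  /* last position at which a command may still start */
} queue_model_t;

/* Point the model at a buffer, keeping room for one maximum-size command. */
static void queue_model_init(queue_model_t *q, uint32_t *buf,
                             int buf_words, int max_cmd_words)
{
    assert(buf_words >= max_cmd_words);
    q->cur = buf;
    q->sentinel = buf + buf_words - max_cmd_words;
}

/* True when the next command must go into another buffer. */
static bool queue_model_is_full(const queue_model_t *q)
{
    return q->cur > q->sentinel;
}
```

A single pointer comparison is enough because the sentinel already accounts for the largest command that could be written next.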

Data Fields
void *	buffers[2]	The two buffers used to build the RSP queue.
int	buf_size	Size of each buffer in 32-bit words.
int	buf_idx	Index of the buffer currently being written to.
uint32_t	sp_status_bufdone	SP status bit to signal that one buffer has been run by RSP.
uint32_t	sp_wstatus_set_bufdone	SP mask to set the bufdone bit.
uint32_t	sp_wstatus_clear_bufdone	SP mask to clear the bufdone bit.
volatile uint32_t *	cur	Current write pointer within the active buffer.
volatile uint32_t *	sentinel	Current write sentinel within the active buffer.

Macro Definition Documentation

◆ rspq_append1

#define rspq_append1	(	ptr,
		cmd,
		arg1
	)

Value:

    ({ \
    ((volatile uint32_t*)(ptr))[0] = ((cmd)<<24) | (arg1); \
    ptr += 1; \
})

Smaller version of rspq_write that writes to an arbitrary pointer.

◆ rspq_append2

#define rspq_append2	(	ptr,
		cmd,
		arg1,
		arg2
	)

Value:

    ({ \
    ((volatile uint32_t*)(ptr))[1] = (arg2); \
    ((volatile uint32_t*)(ptr))[0] = ((cmd)<<24) | (arg1); \
    ptr += 2; \
})

Smaller version of rspq_write that writes to an arbitrary pointer.

◆ rspq_append3

#define rspq_append3	(	ptr,
		cmd,
		arg1,
		arg2,
		arg3
	)

Value:

    ({ \
    ((volatile uint32_t*)(ptr))[1] = (arg2); \
    ((volatile uint32_t*)(ptr))[2] = (arg3); \
    ((volatile uint32_t*)(ptr))[0] = ((cmd)<<24) | (arg1); \
    ptr += 3; \
})

Smaller version of rspq_write that writes to an arbitrary pointer.

Function Documentation

◆ DEFINE_RSP_UCODE()

DEFINE_RSP_UCODE	(	rsp_queue	,
		.	crash_handler = `rspq_crash_handler`,
		.	assert_handler = `rspq_assert_handler`
	)

The RSPQ ucode

◆ rspq_init()

void rspq_init ( void )

Initialize the RSPQ library.

This should be called by the initialization functions of the higher-level libraries using the RSP command queue. It can be safely called multiple times without side effects.

It is not required by applications to call this explicitly in the main function.

◆ rspq_close()

void rspq_close ( void )

Shut down the RSPQ library.

This is mainly used for testing.

◆ rspq_overlay_get_state()

void * rspq_overlay_get_state ( rsp_ucode_t * overlay_ucode )

Return a pointer to the overlay state (in RDRAM)

Overlays can define a section of DMEM as persistent state. This area will be preserved across overlay switching, by reading back into RDRAM the DMEM contents when the overlay is switched away.

This function returns a pointer to the state area in RDRAM (not DMEM). It is meant to modify the state on the CPU side while the overlay is not loaded. The layout of the state and its size should be known to the caller.

To avoid race conditions between overlay state access by CPU and RSP, this function first calls rspq_wait to force a full sync and make sure the RSP is idle. As such, it should be treated as a debugging function.

Parameters

overlay_ucode The ucode overlay for which the state pointer will be returned.

Returns: Pointer to the overlay state (in RDRAM). The pointer is returned in the cached segment, so make sure to handle cache coherency appropriately.

◆ __rspq_get_state()

rsp_queue_t * __rspq_get_state ( void )

Return a pointer to a copy of the current RSPQ state.

Note: This function forces a full sync by calling rspq_wait to avoid race conditions.

◆ rspq_overlay_register()

uint32_t rspq_overlay_register ( rsp_ucode_t * overlay_ucode )

Register a rspq overlay into the RSP queue engine.

This function registers a rspq overlay into the queue engine. An overlay is a RSP ucode that has been written to be compatible with the queue engine (see rsp_queue.inc for instructions) and is thus able to execute commands that are enqueued in the queue. An overlay doesn't have a single entry point: it exposes multiple functions bound to different commands, that will be called by the queue engine when the commands are enqueued.

The function returns the overlay ID, which is the ID to use to enqueue commands for this overlay. The overlay ID must be passed to rspq_write when adding new commands. rspq allows up to 16 overlays to be registered simultaneously, as the overlay ID occupies the top 4 bits of each command. The lower 4 bits specify the command ID, so in theory each overlay could offer a maximum of 16 commands. To overcome this limitation, this function will reserve multiple consecutive IDs in case an overlay with more than 16 commands is registered. These additional IDs are silently occupied and never need to be specified explicitly when queueing commands.

For example, if an overlay with 32 commands were registered, this function could return ID 0x60, and ID 0x70 would implicitly be reserved as well. To queue the twenty-first command of this overlay (command index 0x14), you would write rspq_write(ovl_id, 0x14, ...), where ovl_id is the value that was returned by this function.

Parameters

overlay_ucode The overlay to register

Returns: The overlay ID that has been assigned to the overlay. Note that this value will be preshifted by 28 (eg: 0x60000000 for ID 6), as this is the expected format used by rspq_write.
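
The ID arithmetic described above can be checked with a short sketch. Both helper names are hypothetical, and the way the command index is combined with the preshifted overlay ID (an addition, so that indices of 16 or more spill into the next reserved ID) is an assumption consistent with the example, not a copy of rspq_write.

```c
#include <assert.h>
#include <stdint.h>

/* Number of consecutive overlay IDs needed for n_commands (16 commands per ID). */
static uint32_t overlay_ids_needed(uint32_t n_commands)
{
    assert(n_commands > 0);
    return (n_commands + 15) / 16;
}

/*
 * Command ID byte (top byte of the first command word) for a command, given
 * the preshifted overlay ID (eg: 0x60000000) and the command index.
 */
static uint8_t command_id_byte(uint32_t ovl_id, uint32_t cmd_index)
{
    return (uint8_t)((ovl_id + (cmd_index << 24)) >> 24);
}
```

For the overlay of the example, command index 0x14 yields the byte 0x74 in this model, which lands in the implicitly reserved ID 0x70.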

◆ rspq_overlay_register_static()

void rspq_overlay_register_static	(	rsp_ucode_t *	overlay_ucode,
		uint32_t	overlay_id
	)

Register an overlay into the RSP queue engine assigning a static ID to it.

This function works similarly to rspq_overlay_register, except it will attempt to assign the specified ID to the overlay instead of automatically choosing one. Note that if the ID (or one of the consecutive IDs required by the overlay) is already used by another overlay, this function will assert, so careful usage is advised.

Assigning a static ID can mostly be useful for debugging purposes.

Parameters

overlay_ucode	The ucode to register
overlay_id	The ID to register the overlay with. This ID must be preshifted by 28 (eg: 0x40000000).

See also: rspq_overlay_register

◆ rspq_overlay_unregister()

void rspq_overlay_unregister ( uint32_t overlay_id )

Unregister a ucode overlay from the RSP queue engine.

This function removes an overlay that has previously been registered with rspq_overlay_register or rspq_overlay_register_static from the queue engine. After calling this function, the specified overlay ID (and consecutive IDs in case the overlay has more than 16 commands) is no longer valid and must not be used to write new commands into the queue.

Note that when new overlays are registered, the queue engine may recycle IDs from previously unregistered overlays.

Parameters

overlay_id The ID of the ucode (as returned by rspq_overlay_register) to unregister.

◆ rspq_next_buffer()

void rspq_next_buffer ( void )

Switch to the next write buffer for the current RSP queue.

This function is invoked by rspq_write when the current buffer is full, that is, when the write pointer (rspq_cur_pointer) reaches the sentinel (rspq_cur_sentinel). This means that we cannot safely write any more commands in the buffer (the remaining bytes are less than the maximum command size), and thus a new buffer must be configured.

If we're creating a block, we need to allocate a new buffer from the heap. Otherwise, if we're writing into either the lowpri or the highpri queue, we need to switch buffer (double buffering strategy), making sure the other buffer has been already fully executed by the RSP.

◆ rspq_flush()

void rspq_flush ( void )

Make sure that RSP starts executing up to the last written command.

RSP processes the command queue asynchronously as it is being written. If it catches up with the CPU, it halts itself and waits for the CPU to notify that more commands are available. On the contrary, if the RSP lags behind it might keep executing commands as they are written without ever sleeping. So in general, at any given moment the RSP could be crunching commands or sleeping waiting to be notified that more commands are available.

This means that writing a command via rspq_write is not enough to make sure it is executed; depending on timing and batching performed by RSP, it might either be executed automatically or not. rspq_flush makes sure that the RSP will see it and execute it.

This function does not block: it just makes sure that the RSP will run the full command queue written until now. If you need to actively wait until the last written command has been executed, use rspq_wait.

It is suggested to call rspq_flush every time a new "batch" of commands has been written. In general, it is not a problem to call it often because it is very fast (takes only ~20 cycles). For instance, it can be called after every rspq_write without many worries, but if you know that you are going to write a number of subsequent commands in straight line code, you can postpone the call to rspq_flush until after the whole sequence has been written.

    // This example shows some code configuring the lights for a scene.
    // The command in this sample is called CMD_SET_LIGHT and requires
    // a light index and the RGB colors for the light to update.
    uint32_t gfx_overlay_id;

    #define CMD_SET_LIGHT  0x7

    for (int i=0; i<MAX_LIGHTS; i++) {
        rspq_write(gfx_overlay_id, CMD_SET_LIGHT, i,
            (lights[i].r << 16) | (lights[i].g << 8) | lights[i].b);
    }

    // After enqueuing multiple commands, it is sufficient
    // to call rspq_flush once to make sure the RSP runs them (in case
    // it was idling).
    rspq_flush();

Note: This is an experimental API. In the future, it might become a no-op, and flushing could happen automatically at every rspq_write. We are keeping it separate from rspq_write while experimenting more with the RSPQ API.

Note: This function is a no-op if it is called while a block is being recorded (see rspq_block_begin / rspq_block_end). This means calling this function in a block recording context will not guarantee the execution of commands that were queued prior to starting the block.

◆ rspq_highpri_begin()

void rspq_highpri_begin ( void )

Start building a high-priority queue.

This function enters a special mode in which a high-priority queue is activated and can be filled with commands. After this function has been called, all commands will be put in the high-priority queue, until rspq_highpri_end is called.

The RSP will start processing the high-priority queue almost instantly (as soon as the current command is done), pausing the normal queue. This will also happen while the high-priority queue is being built, to achieve the lowest possible latency. When the RSP finishes processing the high priority queue (after rspq_highpri_end closes it), it resumes processing the normal queue from the exact point that was left.

The goal of the high-priority queue is to either schedule latency-sensitive commands like audio processing, or to schedule immediate RSP calculations that should be performed right away, just like they were preempting what the RSP is currently doing.

It is possible to create multiple high-priority queues by calling rspq_highpri_begin / rspq_highpri_end multiple times with short delays in-between. The RSP will process them in order. Notice that there is an overhead in doing so, so it might be advisable to keep the high-priority mode active for a longer period if possible. On the other hand, a shorter high-priority queue allows for the RSP to switch back to processing the normal queue before the next one is created.

Note: It is not possible to create a block while the high-priority queue is active. Arrange for constructing blocks beforehand.

Note: It is currently not possible to call a block from the high-priority queue. (FIXME: to be implemented)

◆ rspq_highpri_end()

void rspq_highpri_end ( void )

Finish building the high-priority queue and close it.

This function terminates and closes the high-priority queue. After this command is called, all following commands will be added to the normal queue.

Notice that the RSP does not wait for this function to be called: it will start running the high-priority queue as soon as possible, even while it is being built.

◆ rspq_highpri_sync()

void rspq_highpri_sync ( void )

Wait for the RSP to finish processing all high-priority queues.

This function will spin-lock waiting for the RSP to finish processing all high-priority queues. It is meant for debugging purposes or for situations in which the high-priority queue is known to be very short and fast to run. Also note that it is not possible to create syncpoints in the high-priority queue.

◆ rspq_block_begin()

void rspq_block_begin ( void )

Begin creating a new block.

This function begins writing a command block (see rspq_block_t). While a block is being written, all calls to rspq_write will record the commands into the block, without actually scheduling them for execution. Use rspq_block_end to close the block and get a reference to it.

Only one block at a time can be created. Calling rspq_block_begin twice (without any intervening rspq_block_end) will cause an assert.

During block creation, the RSP will keep running as usual and execute commands that have been already added to the queue.

Note: Calls to rspq_flush are ignored during block creation, as the RSP is not going to execute the block commands anyway.

◆ rspq_block_end()

rspq_block_t * rspq_block_end ( void )

Finish creating a block.

This function completes a block and returns a reference to it (see rspq_block_t). After this function is called, all subsequent rspq_write will resume working as usual: they will add commands to the queue for immediate RSP execution.

To run the created block, use rspq_block_run.

Returns: A reference to the just created block

See also: rspq_block_begin; rspq_block_run

◆ rspq_block_free()

void rspq_block_free ( rspq_block_t * block )

Free a block that is not needed any more.

After calling this function, the block is invalid and must not be called anymore.

Parameters

block The block

Note: If the block was being called by other blocks, these other blocks become invalid and will make the RSP crash if called. Make sure that freeing a block is only done when no other blocks reference it.

◆ rspq_block_run()

void rspq_block_run ( rspq_block_t * block )

Add to the RSP queue a command that runs a block.

This function runs a block that was previously created via rspq_block_begin and rspq_block_end. It schedules a special command in the queue that will run the block, so that execution of the block will happen in order relative to other commands in the queue.

Blocks can call other blocks. For instance, if a block A has been fully created, it is possible to call rspq_block_run(A) at any point during the creation of a second block B; this means that B will contain the special command that will call A.

Parameters

block The block that must be run

Note: The maximum depth of nested block calls is 8.

◆ rspq_noop()

void rspq_noop ( void )

Enqueue a no-op command in the queue.

This function enqueues a command that does nothing. This is mostly useful for debugging purposes.

◆ rspq_syncpoint_new()

rspq_syncpoint_t rspq_syncpoint_new ( void )

Create a syncpoint in the queue.

This function creates a new "syncpoint" referencing the current position in the queue. It is possible to later check when the syncpoint is reached by the RSP via rspq_syncpoint_check and rspq_syncpoint_wait.

Returns: ID of the just-created syncpoint.

Note: It is not possible to create a syncpoint within a block because it is meant to be a one-time event. Otherwise the same syncpoint would potentially be triggered multiple times, which is not supported.

Note: It is not possible to create a syncpoint from the high-priority queue due to the implementation requiring syncpoints to be triggered in the same order they have been created.

See also: rspq_syncpoint_t

◆ rspq_syncpoint_check()

bool rspq_syncpoint_check ( rspq_syncpoint_t sync_id )

Check whether a syncpoint was reached by RSP or not.

This function checks whether a syncpoint was reached. It never blocks. If you need to wait for a syncpoint to be reached, use rspq_syncpoint_wait instead of polling this function.

Parameters

[in] sync_id ID of the syncpoint to check

Returns: true if the RSP has reached the syncpoint, false otherwise

See also: rspq_syncpoint_t
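
Since syncpoints are triggered in the same order in which they are created, a non-blocking check can be modeled as a comparison against the ID of the last syncpoint reached (cf. __rspq_syncpoints_done). The following is a host-side sketch with hypothetical names and no wraparound handling, not the library implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: IDs are handed out in increasing order, starting at 1. */
typedef struct {
    int last_created;  /* ID of the most recently created syncpoint */
    int done;          /* ID of the last syncpoint reached by the reader */
} sync_model_t;

/* Create a new syncpoint at the current queue position. */
static int sync_model_new(sync_model_t *s)
{
    return ++s->last_created;
}

/* Called when the reader side reaches the next pending syncpoint. */
static void sync_model_reach_next(sync_model_t *s)
{
    assert(s->done < s->last_created);  /* there must be one pending */
    s->done++;
}

/* Non-blocking check: reached if and only if the ID is not after `done`. */
static bool sync_model_check(const sync_model_t *s, int sync_id)
{
    return sync_id <= s->done;
}
```

In this model a blocking wait is simply a loop that repeats the check until it returns true.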

◆ rspq_syncpoint_wait()

void rspq_syncpoint_wait ( rspq_syncpoint_t sync_id )

Wait until a syncpoint is reached by RSP.

This function blocks waiting for the RSP to reach the specified syncpoint. If the syncpoint was already reached at the moment of the call, the function exits immediately.

Parameters

[in] sync_id ID of the syncpoint to wait for

See also: rspq_syncpoint_t

◆ rspq_wait()

void rspq_wait ( void )

Wait until all commands in the queue have been executed by RSP.

This function blocks until all commands present in the queue have been executed by the RSP and the RSP is idle. If the queue also contained RDP commands, it also waits for those commands to finish drawing.

This function exists mostly for debugging purposes. Calling this function is not necessary, as the CPU can continue adding commands to the queue while the RSP is running them. If you need to synchronize between RSP and CPU (eg: to access data that was processed by RSP) prefer using rspq_syncpoint_new / rspq_syncpoint_wait which allows for more granular synchronization.

◆ rspq_dma_to_rdram()

void rspq_dma_to_rdram	(	void *	rdram_addr,
		uint32_t	dmem_addr,
		uint32_t	len,
		bool	is_async
	)

Enqueue a command to do a DMA transfer from DMEM to RDRAM.

Parameters

	rdram_addr	The RDRAM address (destination, must be aligned to 8)
[in]	dmem_addr	The DMEM address (source, must be aligned to 8)
[in]	len	Number of bytes to transfer (must be multiple of 8)
[in]	is_async	If true, the RSP does not wait for DMA completion and processes the next command while the DMA is in progress. If false, the RSP waits until the transfer is finished before processing the next command.

Note: The argument is_async refers to the RSP only. From the CPU standpoint, this function is always asynchronous as it just adds a command to the queue.

◆ rspq_dma_to_dmem()

void rspq_dma_to_dmem	(	uint32_t	dmem_addr,
		void *	rdram_addr,
		uint32_t	len,
		bool	is_async
	)

Enqueue a command to do a DMA transfer from RDRAM to DMEM.

Parameters

[in]	dmem_addr	The DMEM address (destination, must be aligned to 8)
	rdram_addr	The RDRAM address (source, must be aligned to 8)
[in]	len	Number of bytes to transfer (must be multiple of 8)
[in]	is_async	If true, the RSP does not wait for DMA completion and processes the next command while the DMA is in progress. If false, the RSP waits until the transfer is finished before processing the next command.

Note: The argument is_async refers to the RSP only. From the CPU standpoint, this function is always asynchronous as it just adds a command to the queue.
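
Both DMA functions list the same constraints: addresses aligned to 8 bytes and a length that is a multiple of 8. A caller-side check could be sketched as below. The helpers model_dma_args_ok and model_dma_round_len are hypothetical and not part of libdragon; they encode only the constraints stated in the parameter lists.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of libdragon: verifies the documented
   constraints before a DMA command is enqueued. */
bool model_dma_args_ok(const void *rdram_addr, uint32_t dmem_addr, uint32_t len)
{
    bool rdram_aligned = ((uintptr_t)rdram_addr & 7u) == 0; /* aligned to 8 */
    bool dmem_aligned  = (dmem_addr & 7u) == 0;             /* aligned to 8 */
    bool len_ok        = (len & 7u) == 0;                   /* multiple of 8 */
    return rdram_aligned && dmem_aligned && len_ok;
}

/* Round a byte count up to the next multiple of 8, e.g. when sizing
   the buffer that a transfer will use. */
uint32_t model_dma_round_len(uint32_t len)
{
    return (len + 7u) & ~7u;
}
```

A 60-byte payload, for instance, would need a 64-byte transfer, so the RDRAM buffer has to be at least that large.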

◆ rspq_write_begin()

rspq_write_t rspq_write_begin	(	uint32_t	ovl_id,
		uint32_t	cmd_id,
		int	size
	)

extern inline

Begin writing a new command into the RSP queue.

This function starts a sequence that enqueues a new command into the RSP queue. Call it with the overlay ID and command ID of the command to create. Then call rspq_write_arg once for each argument word that makes up the command. Finally, call rspq_write_end to finalize and enqueue the command.

A sequence of rspq_write_begin, rspq_write_arg and rspq_write_end calls is functionally equivalent to a call to rspq_write, but it allows creating larger commands and may better fit situations where arguments are computed on the fly. Performance-wise, the code generated for the sequence should be very similar to a single call to rspq_write, though slightly slower. Use rspq_write whenever possible.

Make sure to read the documentation of rspq_write as well for further details.

Parameters

ovl_id	The overlay ID of the command to enqueue. Note that this must be a value pre-shifted by 28, as returned by rspq_overlay_register.
cmd_id	Index of the command to call, within the overlay.
size	The size of the command in 32-bit words

Returns: A write cursor that must be passed to rspq_write_arg and rspq_write_end

See also: rspq_write_arg; rspq_write_end; rspq_write
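
The begin / arg / end sequence can be illustrated with a small host-side model. This is a sketch, not libdragon's implementation: all model_* names are hypothetical, the queue is a plain array, and the layout assumes that the overlay ID (pre-shifted by 28) and the command index (shifted by 24) share the top byte of the first word, which is one reading of the requirement that the first argument leave room for the command ID. Storing the first word last is also an assumption, chosen so that a consumer never sees a half-written command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_QUEUE_WORDS 64

/* Hypothetical write cursor; not the real rspq_write_t. */
typedef struct {
    uint32_t first_word;   /* header, later merged with the first argument */
    uint32_t *first_slot;  /* where the first word is stored at the end */
    uint32_t *next_slot;   /* where the next argument word goes */
    bool is_first;         /* true until the first argument is written */
} model_write_t;

uint32_t model_queue[MODEL_QUEUE_WORDS];
static int model_used = 0;

/* Reserve `size` words and prepare the header from overlay and command IDs. */
model_write_t model_write_begin(uint32_t ovl_id, uint32_t cmd_id, int size)
{
    assert(size >= 1 && model_used + size <= MODEL_QUEUE_WORDS);
    model_write_t w = {
        .first_word = ovl_id | (cmd_id << 24),
        .first_slot = &model_queue[model_used],
        .next_slot  = &model_queue[model_used + 1],
        .is_first   = true,
    };
    model_used += size;
    return w;
}

/* The first argument shares a word with the header; later ones get their own. */
void model_write_arg(model_write_t *w, uint32_t value)
{
    if (w->is_first) {
        assert((value >> 24) == 0); /* top byte is reserved for the header */
        w->first_word |= value;
        w->is_first = false;
    } else {
        *w->next_slot++ = value;
    }
}

/* Publish the command by storing its first word. */
void model_write_end(model_write_t *w)
{
    *w->first_slot = w->first_word;
}
```

In this model, a three-word command for overlay 0x2 and command index 0x3 whose first argument is 0x00ABCDEF ends up with 0x23ABCDEF as its first word, followed by the remaining two argument words unchanged.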

◆ rspq_write_arg()

void rspq_write_arg	(	rspq_write_t *	w,
		uint32_t	value
	)

extern inline

Add one argument to the command being enqueued.

This function adds one more argument to the command currently being enqueued. It must be called after rspq_write_begin, once per argument word; rspq_write_end must then be called to finish enqueuing the command.

See also rspq_write for a more straightforward API for command enqueuing.

Parameters

w	The write cursor (returned by rspq_write_begin)
value	New 32-bit argument word to add to the command.

Note: The first argument must have its MSB set to 0, to leave space for the command ID. See rspq_write documentation for a more complete explanation.

See also: rspq_write_begin; rspq_write_end; rspq_write

◆ rspq_write_end()

void rspq_write_end ( rspq_write_t * w )

extern inline

Finish enqueuing a command into the queue.

This function must be called to terminate a command-enqueuing sequence, after rspq_write_begin and one or more calls to rspq_write_arg.

After calling this function, the write cursor cannot be used anymore.

Parameters

w	The write cursor (returned by rspq_write_begin)

See also: rspq_write_begin; rspq_write_arg; rspq_write
