www.riscos.com - Acorn Computer Archives

Phoebe / Ursula: An Introduction

It is intended that the transition from Risc PC / RISC OS 3.7 to Phoebe / Ursula should be significantly easier for third-party applications to make than the transition from Risc PC / RISC OS 3.6 with ARM710 to Risc PC / RISC OS 3.7 with StrongARM. The vast majority of the problems observed with third-party applications in the move from ARM710 to StrongARM were connected with the architecture of the new CPU; as Ursula supports only ARM Architecture 4 (ie StrongARM) and Phoebe will ship with a StrongARM, problems of this specific nature will not arise at release time.

From a low-level perspective, the SA110 in Phoebe is functionally identical to the SA110 in Risc PC (with the single exception that the STM^ problem has been resolved), Phoebe's VIDC20R is identical at register level to Risc PC's VIDC20 / 20A, IOMD2 is a register-level superset of IOMD, and with the exception of the hard disc PIO modes available, the SMC PIO should be backwardly compatible with the SMC 37665 (functions it implements are identical at the SWI layer). Control of the keyboard and mouse has been moved into the SMC chip for Phoebe; in Risc PC, it was handled by IOMD. Very little, if any, third-party software will need to worry about changes at the hardware register level.

However, it must be noted that legacy code which will not work on a StrongARM Risc PC will also not work on Phoebe; see Application Note 295 for details of resolving the problems of self-modifying code and pipeline lengths.

Incompatibilities with Risc PC

Phoebe has two serial ports, numbered 1 and 2. To handle this, a new set of calls which take the form of Unix-style IOCTLs have been added to FileSwitch and DeviceFS; it is recommended that code which interacts with serial ports use this interface and treat the serial ports as streaming devices rather than use the old-style OS_SerialOp calls (which are only able to understand one serial port, but see following sections). It should also be noted that Port 2 is missing its DSR line, and this port is also taken over by the IrDA interface when IrDA is in use; the function of Port 2 is assigned to the infrared interface or the 9-pin D-type socket using an IOCTL, and defaults to the 9-pin socket.

As CDFS has been completely rewritten, third-party CD ROM drivers and CD audio players will need to be examined to see whether they need rework to fit in with the new API.

CDFS 3 (to be shipped with Ursula) has no support for SCSI CD ROM drives, although drivers can be written for such devices to fit into its scheme of doing things.

A new mechanism for handling service calls has been introduced, which requires the format of relocatable modules which make use of service calls to be changed if service calls are to be despatched efficiently. Although leaving a module which uses service calls "as is" will not cause the system to break, it will cause it to slow down significantly (although not to a speed less than that of any other version of RISC OS; it will merely remove the performance advantage the new handler mechanism gives). The changes which need to be made to such a module are described in the Ursula kernel spec; in overview, a pointer is set in the module header which informs the kernel that a lookup table of the service calls handled by the module is present somewhere in the module body.
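
The benefit of the lookup table can be illustrated with a small model. The sketch below is in Python purely for illustration; it does not reproduce the real module header layout or the kernel's data structures (those are defined in the Ursula kernel spec), and it ignores claiming and module ordering. The names `Module`, `build_index` and `dispatch` are hypothetical, and the service call numbers are arbitrary. It shows that a module which declares a table is entered only for calls it lists, while a module without one must still be offered every call.

```python
from collections import defaultdict


class Module:
    """Hypothetical stand-in for a relocatable module."""

    def __init__(self, name, handled, has_table):
        self.name = name
        self.handled = frozenset(handled)  # service call numbers acted on
        self.has_table = has_table         # True if a lookup table is declared
        self.entries = 0                   # times the service handler was entered

    def service(self, number):
        self.entries += 1
        return number in self.handled      # True if the call was of interest


def build_index(modules):
    """Map each service number to the modules whose table lists it.
    Modules without a table are kept apart: they must see every call."""
    index = defaultdict(list)
    untabled = [m for m in modules if not m.has_table]
    for m in modules:
        if m.has_table:
            for number in m.handled:
                index[number].append(m)
    return index, untabled


def dispatch(index, untabled, number):
    """Offer a service call only to modules that might want it."""
    targets = index.get(number, []) + untabled
    return [m.name for m in targets if m.service(number)]
```

With three modules, two of which declare a table, issuing call 1 enters only the module that lists it plus the legacy module; the other tabled module is never entered, which is where the saving comes from.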

Names of dynamic areas are now truncated to a maximum of 31 characters (although this is not expected to affect many applications at all!).

Memory protection of kernel workspace has been improved, particularly with respect to zero page; in particular, address zero (the reset vector) is now protected from User mode reads as well as writes. This will only affect software which is intrinsically broken; however, whereas it has previously been feasible to get away with a modicum of brokenness in this area, it no longer will be. Failure of RISC OS 3.7-compatible software under test on Ursula owing to address exceptions or data aborts is likely to be symptomatic of this type of problem, or a problem with dynamic areas (see immediately below).

Dynamic areas can now be bound to the application which creates them. This may raise issues for software which attempts to access dynamic areas "owned" by other applications; the binding system, and the ability to have Sparse areas (see the Ursula Kernel spec), means that dynamic areas are now no longer guaranteed to reside at disjoint addresses, nor always be accessible, nor always be mapped from their base addr. Software which reads another application's area without bearing this in mind may read garbage or cause a data abort.

FileCore

The rework of FileCore to support filenames >10 chars and directories of >77 objects has some interesting repercussions for applications:

* The location of the root directory on the disc has moved. This means that disc editors and recovery programs will break on discs which support long filenames and directories, although the intention is that they should fail without causing harm (they will be unable to read anything on startup which looks like a directory, and should halt at this point).

* Any application code which makes assumptions about lengths of filenames or numbers of objects in directories based on limits imposed by the RISC OS 3.7 FileCore is likely to break. Where directory contents (filenames etc) need to be enumerated, they should be enumerated into an area of memory which can be grown dynamically under the feet of the enumeration process.

For the purpose of testing, it is sufficient to ensure that code works correctly when reading from and writing to a filing system which has a similar lack of restrictions to the new FileCore (eg NFS, which, like FileCore, can support filenames of up to 255 characters and a practically unlimited number of objects per directory).
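
The advice on enumeration can be sketched as follows. This is a minimal model in Python, not RISC OS code: `make_fake_directory` and its `read_batch` callback are hypothetical stand-ins for a filing system call that returns entries in batches and reports a next offset of -1 when the directory is exhausted. The point is only that the collecting structure grows with the data, rather than being sized for 77 entries of 10 characters.

```python
def make_fake_directory(names):
    """Hypothetical batch reader standing in for a filing system call.
    Returns (entries, next_offset); next_offset is -1 when exhausted."""
    def read_batch(offset, count):
        chunk = names[offset:offset + count]
        nxt = offset + len(chunk)
        return chunk, (nxt if nxt < len(names) else -1)
    return read_batch


def enumerate_all(read_batch, batch_size=16):
    """Collect every entry without assuming a maximum count or name length."""
    entries = []      # grows as needed; no fixed-size array of 77 slots
    offset = 0
    while offset != -1:
        chunk, offset = read_batch(offset, batch_size)
        entries.extend(chunk)
    return entries
```

A test directory for this kind of code should deliberately contain more than 77 objects and names longer than 10 characters, so that any remaining fixed-size assumption shows up early.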

The SWI interface for such things as defect mapping should remain unchanged.

Side-effects of these changes which have dependencies on other unmodified parts of the system are:

* Sprite names remain restricted to a maximum length of 12 characters.

Therefore if an application wishes to use *IconSprites to load a !<appname> sprite to be placed in a filer display, the application name must be unique within its first 12 characters (the application name can be longer, but the sprite mapping will only take the first 12 characters of it into consideration).

* The format and size of Wimp messages will not be changed in Ursula.

Therefore, when files are loaded and saved from Desktop applications, either between themselves or to and from the Filer, their fully-resolved pathnames must not overflow the Wimp message block. The Wimp block is able to support pathnames of about 200 characters.
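
Both side-effects can be checked mechanically. The sketch below is illustrative Python, with hypothetical function names. The 12-character sprite limit comes from the text; the 256-byte block size and the 44-byte offset of the pathname within a data transfer message are assumptions about the usual message layout (the text itself says only "about 200 characters"), so treat the exact boundary as approximate.

```python
from collections import defaultdict

SPRITE_NAME_MAX = 12    # from the text: sprite names stay at 12 characters
WIMP_BLOCK_SIZE = 256   # assumption: total size of a Wimp message block
PATH_OFFSET = 44        # assumption: offset of the pathname in the message


def sprite_name_clashes(app_names):
    """Group application names whose first 12 characters coincide."""
    groups = defaultdict(list)
    for name in app_names:
        groups[name[:SPRITE_NAME_MAX]].append(name)
    return [group for group in groups.values() if len(group) > 1]


def path_fits_in_message(path):
    """True if the path plus one terminator byte fits after the header
    fields. Assumes one byte per character."""
    return len(path) + 1 <= WIMP_BLOCK_SIZE - PATH_OFFSET
```

For example, `!LongAppName1` and `!LongAppName2` share the same first 12 characters and would therefore map to the same sprite, whereas `!Short` is unaffected.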

Things You Might Expect to Cause Problems But Which Actually Don't

OS_SerialOp is only capable of understanding one serial port, however OS_SerialOp calls in existing software will continue to drive Port 1 in the expected manner provided that the new IOCtl style of call (qv) is not also attempting to drive Port 1. Therefore if a piece of software is unable to drive a peripheral connected to Port 2, swap the peripheral to Port 1 and it should work.

The keyboard driver has been reworked to provide more comprehensive support for PS/2 keyboards. The SWI interface and handler locations remain unchanged from Risc PC, so all this will continue to work. As the PS/2 interface forbids direct keyboard scanning by other methods, no opportunity is given for complications to arise. Ursula's keyboard handler has support for 105-key keyboards (those which have the extra keys in the region of the spacebar which make great meta keys for emacs...), however these additional keys are mapped to currently-unused numbers and thus the mapping of pre-existing keys is unchanged.

Mouse support has also been changed below SWI level; the SWI interface and event handlers remain the same, however. As with the keyboard, direct access is forbidden by the PS/2 interface.

The area of memory reserved for the screen is now cached, to decrease redraw times. This does not affect shadow screens or video bank switching. However, hardware scrolling would cause screen corruption on a cached screen; this is fixed by the kernel being able to detect an attempt to perform a hardware scroll so that it can uncache the screen before the scroll takes place.

Therefore, although hardware scrolling will continue to work, screen update will slow down as a result.

MIDI support will be implemented as a set of SWIs identical to those provided in the software supporting the original Acorn MIDI card (with the exception of any handling of the Thru path, which will be made to do nothing).

Support for the Euro currency symbol is provided; it is included as character &F0 in the Sidney (Symbol) font, it is likely to go in the system font at &80, and we're also keeping an eye on the proposed ISO Latin0 standard (which will carry it at &84) for implementation if ratified.

The size of the command line buffer has been increased so that operations such as *copy <large filename> <other large filename> <flags> can be executed without running into the 256-character limit imposed on earlier OS versions.

The limit on command line length is now 1024 characters.
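
As a rough illustration of why the old limit was a problem once long filenames are available, the following Python sketch builds a copy command from two long pathnames and compares its length with both limits. The helper names are hypothetical, and whether a terminator counts towards the limit is not stated in the text, so the comparison is on the command text alone.

```python
OLD_LIMIT = 256    # limit on earlier OS versions, from the text
NEW_LIMIT = 1024   # limit under Ursula, from the text


def copy_command(src, dst, flags=""):
    """Build a copy command line from two pathnames and optional flags."""
    return " ".join(part for part in ("copy", src, dst, flags) if part)


def fits(command, limit):
    # Assumption: the limit applies to the command text only.
    return len(command) <= limit
```

Two pathnames of around 170 characters each already exceed the old limit, yet sit comfortably within the new one.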

Memory allocation has been tight in RISC OS >=3.5 when in the Supervisor language environment, particularly bearing in mind the processes a machine goes through in the pre-desktop phase of the boot sequence. The size of the Supervisor stack in Ursula has therefore been increased from 8K to 32K.

Floating Point

To take advantage of Ursula's capability to execute 32-bit ARM code which can be located above 26-bit addressable space (see below), the floating point emulator has been rewritten in 32-bit ARM code. This is expected to give a software speed increase (not counting the increase in speed which will be seen as a result of the faster hardware) of about 10% - 15%, owing to the following factors:

* 32 bit multiplies to give a 64 bit result execute quicker in 32 bit modes.

* The FPE is more tightly coupled to the undefined instruction vector...

...which can be illustrated by the following example:

As things stand in RISC OS 3.7 and earlier, what happens when a floating point instruction is encountered is something like:

"USR26" mode: Undefined instruction found; passes to:
UND32 mode : Exception handler; exception details passed to vector
"USR26" mode: FPEmulator recognises the instruction, traps and executes
UND32 mode : FPEmulator exit veneer
"USR26" mode: Back to the application code which contained the FP instruction.

This all happens because RISC OS 3.x runs the ARM in an unusual mode by default; it doesn't run it in full 26-bit emulation, but runs it as a 32-bit chip with only exception handlers in the 32-bit domain. Switching processor modes between addressing domains wastes cycles.

Now, what you get under Ursula is:

"USR26" mode: Undefined instruction found; passes to:
UND32 mode : Exception handler; exception details passed to vector
UND32 mode : FPEmulator recognises the instruction, traps and executes
UND32 mode : Exit veneer
"USR26" mode: Back to the application code which contained the FP instruction.

Note that this scheme only has two CPU mode changes compared to the four changes in the previous scheme; as this mode changing has to take place for every FP instruction in the original application code, the result is a significantly quicker FPE.
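
The count of mode changes can be verified by writing each scheme as the sequence of CPU modes its five steps run in and counting the transitions. This is a small Python illustration of the arithmetic only; the mode labels are those used in the two step lists for RISC OS 3.7 and Ursula.

```python
# CPU mode of each of the five steps, in order, for the two schemes.
RISC_OS_37 = ["USR26", "UND32", "USR26", "UND32", "USR26"]
URSULA = ["USR26", "UND32", "UND32", "UND32", "USR26"]


def mode_changes(steps):
    """Count consecutive steps which run in different CPU modes."""
    return sum(1 for a, b in zip(steps, steps[1:]) if a != b)
```

`mode_changes(RISC_OS_37)` gives 4 and `mode_changes(URSULA)` gives 2, matching the figures quoted.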

Guidelines for Authoring Software for Phoebe / Ursula

* Code written in C or BASIC will generally work, but see issues (particularly FileCore ones) above.

* Code using the shared C library or Toolbox is generally fine. Using RISCOS_Lib may cause problems, depending on the variant. Anything linked with ANSILib will not work.

* A module patcher will be supplied with Ursula that is capable of detecting whether a module handles service calls and, if so, whether it conforms to the new module standard; if it does not, the patcher patches it. This is only a temporary solution though. If you find your software runs at a reasonable speed and does not slow the rest of the machine down only because the patcher is patching it, you should modify it so it does not need patching (modules should be fixed at source wherever possible). For modules written in C, a new CMHG is available for this purpose.

* RISC OS currently has the problem that on a machine equipped with a large amount of physical RAM, clients may rapidly exhaust logical address space when creating dynamic areas. This arises from the common practice of creating areas with a maximum size of -1, meaning that the area can have a maximum size corresponding to the physical RAM capacity of the machine. Because 32-bit address space is only 4Gb (roughly half of which is available for dynamic areas), machines with large amounts of physical RAM may fail to launch applications simply because they are unable to create areas with sufficient address space. Therefore use of -1 to denote maximum size when calling OS_DynamicArea 1 (create) is now deprecated; Acorn strongly recommends that future applications attempt to offer a sensible maximum size.

* The new !Configure system can be extended by writing plugins, however its efficiency of use drops if every application starts installing !Configure plugins solely to configure the application itself. It is only appropriate to write a !Configure plugin if its use is restricted to configuring system-wide resources.

* The new FPE is expected to be 10% - 15% quicker in software than the current one; this does not take into account the fact that the hardware will also be running more quickly.

* It is now possible to write and execute 32-bit ARM code, via access to USR_32. 32-bit code can be put in a dynamic area, and a new SWI can be issued from the 26-bit world to enter USR_32, from where the code can be jumped to and executed. There is also a SWI to call from the 32-bit world to return to the 26-bit world, but this call must only be made from a code stub which resides within 26-bit address space. Note that writing 32-bit code is only appropriate for ARM cognoscenti who are also brave; it is very unlikely to be needed by or particularly beneficial to conventional applications.

* Workspace required by applications or modules is able to live within the wimpslot reserved by the application (for applications), within the RMA, or within a defined and claimed dynamic area. For workspace which is not going to grow above about 64K and which can be statically allocated at task startup time, application space or RMA is the most appropriate place to put it. When deciding where to put an area of workspace, the following points should be considered:

* Is the workspace likely to need to grow? If so, regardless of its initial size, it's better to allocate it in reserved-wimpslot space or as a dynamic area. Things to be borne in mind here are:

* application space doesn't "fragment" by its intrinsic nature, but having workspace in application space causes task swapping on wimp polls to slow down and the machine to slow down as a result (until lazy task swapping is implemented; see below)

* the RMA can have fixed-size blocks slide such that a reasonable number of small blocks of fixed size will pack efficiently, but growing an existing area can result in major fragmentation

* although a dynamic area is the best place to put an area of workspace which is likely to need to grow, the method by which areas are allocated can, if a large number of areas are allocated, cause problems when subsequently allocating an area which is large (above 64K) owing to fragmentation.

* Is the workspace of a fixed size, but large (above 64K)? If so, it's better to allocate it as a dynamic area, otherwise it can go in application space or the RMA.

* Would it be beneficial for more than one application to be able to access the same area of workspace? If so, it will need to be allocated as a dynamic area.

* Avoid using the system sprite pool as workspace, if you can; this area has been modified such that it can no longer be extended beyond a 16Mb high-water mark to avoid address space exhaustion.

* For future iterations of Ursula, it's worth bearing in mind the concept of lazy task swapping; this will vastly reduce the slowdown associated with paging large applications in and out of application space on wimp polls, and thus increase the preferability of having growable areas of unshared workspace within application space. (For those interested, the idea is to implement task swapping as a demand-paged system; swapping to a new task will do nothing until an instruction or data fetch abort is encountered, at which point the page containing the instruction to be fetched is paged in and the fetching instruction re-started...)
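
Two of the guidelines above reduce to simple rules and can be sketched in a few lines. The Python below is a rule-of-thumb model with hypothetical names, not an API: `suggest_placement` follows the workspace questions in the order shared, growable, large, and does not distinguish applications from modules (a module has no wimpslot of its own). `areas_before_exhaustion` shows the arithmetic behind the advice against a maximum size of -1, assuming that each such area reserves logical address space equal to the physical RAM size and that about 2 GB (the "roughly half" of the text) is available.

```python
from enum import Enum

LARGE = 64 * 1024   # the "about 64K" threshold from the text


class Place(Enum):
    APP_SPACE = "application space (wimpslot)"
    RMA = "RMA"
    DYNAMIC_AREA = "dynamic area"


def suggest_placement(size, may_grow, shared):
    """Return candidate homes for a block of workspace, preferred first."""
    if shared:
        return [Place.DYNAMIC_AREA]                    # must be a dynamic area
    if may_grow:
        return [Place.DYNAMIC_AREA, Place.APP_SPACE]   # regardless of size
    if size > LARGE:
        return [Place.DYNAMIC_AREA]                    # fixed size but large
    return [Place.APP_SPACE, Place.RMA]                # small and static


def areas_before_exhaustion(ram_bytes, space_bytes=2 * 1024**3):
    """Number of areas created with maximum size -1 that fit in the
    logical address space, assuming each reserves ram_bytes of it."""
    return space_bytes // ram_bytes
```

On these assumptions a machine with 256 MB of RAM has room for only 8 such areas, whereas a 64 MB machine has room for 32, which is why the problem appears only as physical memory grows.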
