Bug in staging vme_user

Mon Feb 25 18:35:43 UTC 2013

On 25/02/13 17:00, ternaryd wrote:
> On Mon, 25 Feb 2013 14:18:21 +0000
> Martyn Welch <martyn.welch at ge.com> wrote:
> 
>> The address space, the cycle type and the data width fields should be
>> an OR'ed set of the capabilities you require from the resource. Not
>> all VME windows are equal, for example on some VME bridges only
>> specific windows are capable of A16 operation. If you are going to
>> use A16, you need to specify that to ensure you get a compatible
>> resource.
> 
> The resource is allocated in vme_user_probe() and vme_master_request()
> has arguments VME_A32, VME_SCT, VME_D32. I do remember previous trials
> where this failed for VME_A16, which isn't compatible, is it? But right
> now, I've stepped back to a stock kernel and a16 was accepted.
> Something must have changed that I'm not aware of.
> 

It's attempting to allocate 4 windows from memory. There are only 2
windows in the tsi148 that can address A16. So, without other
modifications, the 3rd call requesting an A16-capable window will fail.
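
To illustrate why the 3rd request fails, here is a minimal user-space
sketch of capability-mask matching. It is a model only, not the tsi148
or VME core code: the CAP_* bit values, struct window,
init_example_bridge() and request_window() are all hypothetical names,
and the choice of windows 0 and 1 as the A16-capable ones is an
assumption made purely for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative capability bits; NOT the kernel's VME_A16/VME_A32 values. */
#define CAP_A16 (1u << 0)
#define CAP_A24 (1u << 1)
#define CAP_A32 (1u << 2)

#define NUM_WINDOWS 4

/* Hypothetical model of one master window on a bridge. */
struct window {
    uint32_t caps;   /* OR'ed set of address spaces the window supports */
    int in_use;      /* non-zero once the window has been handed out */
};

/* Assumption for the example: only windows 0 and 1 can do A16. */
static void init_example_bridge(struct window w[NUM_WINDOWS])
{
    for (size_t i = 0; i < NUM_WINDOWS; i++) {
        w[i].caps = CAP_A24 | CAP_A32;
        if (i < 2)
            w[i].caps |= CAP_A16;
        w[i].in_use = 0;
    }
}

/*
 * Hand out the first free window whose capabilities cover every bit in
 * 'wanted'. Returns the window index, or -1 if none is compatible.
 */
static int request_window(struct window *w, size_t n, uint32_t wanted)
{
    assert(w != NULL);

    for (size_t i = 0; i < n; i++) {
        if (!w[i].in_use && (w[i].caps & wanted) == wanted) {
            w[i].in_use = 1;
            return (int)i;
        }
    }
    return -1;
}
```

With this model, two consecutive request_window(w, NUM_WINDOWS, CAP_A16)
calls succeed and the third returns -1, even though two windows are
still free for requests that only need A32.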

> I can imagine, that hardware dependent settings for slave windows can
> be established at boot time. But I can't in case of master windows, as
> those settings also depend on the slave and can change in time.
> 

Both need to be done via the driver interface and not via the u-boot
commands. The drivers do not expect windows to be set up in advance.

>>> What I did not understand is, why image->size_buf is set to
>>> PCI_BUF_SIZE (0x20000).
> 
>> Because that's the size of the buffer (memory allocation) that the
>> vme_user module is allocating for the slave windows:
> 
>>         image[i].kern_buf = vme_alloc_consistent(image[i].resource,
>>                 image[i].size_buf, &image[i].pci_buf);
> 
> OK. This isn't an image representing the slave window, but memory for
> the kernel talking to PCI. Got it.
> 

To make sure we're on the same page:

Memory is allocated in the driver (vme_user); this is used to provide
the memory exposed via slave windows onto the VME bus.
vme_alloc_consistent() provides addresses for the allocation in the PCI
address space and virtual kernel address space (I hope my terminology is
correct there). The kernel address is stored in kern_buf and allows the
driver to access the memory (essentially to allow it to be read via
s0-s4) and the PCI address is stored in pci_buf, which is needed to
configure the slave window.

>> You will effectively be setting the base address as 0x0000. The upper
>> 16 bits of that address aren't going to be compared when using the
>> A16 address space.
> 
> Your advice has always been sound. I just stepped back to a stock
> kernel, applied the minimum patches for this board (not related to vme)
> and compiled a fresh kernel with vme_user as a module, everything else
> compiled in. The kernel command line was 
> 
> root=/dev/nfs rw nfsroot=192.168.1.9:/srv/nfs/deb-dev ip=192.168.1.124:192.168.1.9:192.168.1.20:255.255.255.0:vme1022:eth0:off panic=1 console=ttyS0,115200n8 vme_tsi148.geoid=1 vme_tsi148.err_chk=1
> 
> Loading the module with bus=0 worked fine. The arguments to
> VME_SET_MASTER were a16, sct|super|data, d16. Also this was accepted
> without error message. Then some lseeks and writes, and the board
> locked up. This time, I forgot to include the lockdep options, but when
> I did earlier, no further messages were sent. But some 20 minutes
> later, the board rebooted, so the watchdog was still alive.
> Unfortunately, not even the console argument helped.

I'd suggest doing some reads first, assuming there are some registers on
the board which you can read and for which you can guess values or
partial values for confirmation. Do single byte/word reads to begin
with, check for error messages in dmesg and check the returned data.

You can also use pwrite() & pread() which may be easier than doing
lseeks and reads/writes.
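
A minimal sketch of that approach follows. The helpers read_reg16() and
write_reg16() are hypothetical names, and the fd is assumed to be an
already opened and configured vme_user master device (for example
/dev/bus/vme/m0, if that is where the node lives on your system). Only
the POSIX pread()/pwrite() calls themselves are real API; no byte
swapping is attempted here.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read one 16-bit word at 'offset' within the window, without a
 * separate lseek(). Returns 0 on success, -1 on error or short read.
 */
static int read_reg16(int fd, off_t offset, uint16_t *val)
{
    assert(val != NULL);

    ssize_t n = pread(fd, val, sizeof(*val), offset);

    return (n == (ssize_t)sizeof(*val)) ? 0 : -1;
}

/*
 * Write one 16-bit word at 'offset' within the window.
 * Returns 0 on success, -1 on error or short write.
 */
static int write_reg16(int fd, off_t offset, uint16_t val)
{
    ssize_t n = pwrite(fd, &val, sizeof(val), offset);

    return (n == (ssize_t)sizeof(val)) ? 0 : -1;
}
```

Because the offset is passed with each call, a failed access cannot
leave the file position somewhere unexpected for the next one. After a
failing call, errno and dmesg are the places to look.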

> 
>> It sounds like your hardware doesn't have the geographic lines wired
>> to the tsi148. Some vendors wire them to an FPGA or other logic and
>> have a custom method to read the geographic address.
> 
> This is, what I get during boot:
> 
> vme_tsi148 0001:08:0e.0: Board is not the VME system controller

Hmm, that's odd. There aren't any u-boot variables set up to force this
behaviour, are there? (TBH, I can't see anything in the u-boot source
to suggest that there is likely to be.)

> vme_tsi148 0001:08:0e.0: VME geographical address is set to 1

I assume this is 0 when you don't set "vme_tsi148.geoid=1".

> vme_tsi148 0001:08:0e.0: VME Write and flush and error check is enabled
> vme_tsi148 0001:08:0e.0: CR/CSR Offset: 1
> vme_tsi148 0001:08:0e.0: CR/CSR already enabled
> 
> The first line is not right, as this card is in slot 1 and there is
> only one other card, the I/O card which is not capable of being the
> system controller. I'm also confused by the last line, as this is a
> fresh boot, and U-Boot didn't get a chance to set anything.
> 

It might be worth checking the backplane, I had a feeling that some of
the backplane termination/configuration was quite important to get
things set up right, though I've never seen this issue before. There
might be a way to force a board to be system controller, but the Linux
driver doesn't currently support this IIRC. Just out of interest, what
happens if you only have the processor board in the rack?

Martyn