vme_tsi148 question

Michael Kenney mfkenney at gmail.com
Wed Feb 5 23:21:00 UTC 2014


On Wed, Feb 5, 2014 at 1:38 PM, Michael Kenney <mfkenney at gmail.com> wrote:
> On Wed, Feb 5, 2014 at 1:22 PM, Martyn Welch <martyn at welchs.me.uk> wrote:
>>
>>
>>
>> On 5 February 2014 17:41, Greg KH <gregkh at linuxfoundation.org> wrote:
>>>
>>> On Tue, Feb 04, 2014 at 06:34:13PM +0000, Martyn Welch wrote:
>>> > On 04/02/14 16:34, Michael Kenney wrote:
>>> > >Hi Martyn,
>>> > >
>>> > >On Tue, Feb 4, 2014 at 7:28 AM, Martyn Welch <martyn.welch at ge.com
>>> > ><mailto:martyn.welch at ge.com>> wrote:
>>> > >
>>> > >    On 28/12/13 00:34, Michael Kenney wrote:
>>> > >
>>> > >        Hi Martyn,
>>> > >
>>> > >        On Fri, Dec 27, 2013 at 4:23 PM, Martyn Welch
>>> > >        <martyn at welchs.me.uk <mailto:martyn at welchs.me.uk>> wrote:
>>> > >
>>> > >            On 27/12/13 20:15, Michael Kenney wrote:
>>> > >
>>> > >
>>> > >                We are using the vme_tsi148 bridge driver along with
>>> > > the
>>> > >                vme_user
>>> > >                driver to access the VME boards. The A/D board requires
>>> > >                D32 bus cycles
>>> > >                and the VME master window is configured accordingly,
>>> > >                however, when
>>> > >                monitoring the bus cycles with a logic analyzer, we
>>> > >                noticed that the
>>> > >                CPU is transferring one byte at a time (i.e. four D8
>>> > >                transfers rather
>>> > >                than one D32).
>>> > >
>>> > >                Is this the expected behavior of the tsi148 driver?
>>> > >
>>> > >
>>> > >            Hi Mike,
>>> > >
>>> > >            This is certainly not the expected behaviour - if the
>>> > > window
>>> > >            is configured
>>> > >            for D32 then it should do 32 bit transfers where possible.
>>> > >
>>> > >            I've heard of this happening recently, but haven't yet been
>>> > >            able to
>>> > >            replicate it. Which VME board are you running Linux on and
>>> > >            which flavour of
>>> > >            Linux?
>>> > >
>>> > >
>>> > >        I'm running Debian 7.2 with kernel 3.2 on a Fastwel CPC600
>>> > >        (Pentium M
>>> > >        based CPU board).
>>> > >
>>> > >
>>> > >    I haven't forgotten about this, still not sure exactly what is
>>> > >    happening.
>>> > >
>>> > >    Is your install/kernel 32 or 64 bit?
>>> > >
>>> > >    Are you doing single 32-bit transfers, or are you seeing this on
>>> > >    longer transfers (i.e. copying a buffer full of data)?
>>> > >
>>> > >
>>> > >Thanks for getting back to me.
>>> > >
>>> > >I'm running a 32-bit kernel and I see this behavior on all transfers
>>> > >regardless of buffer size.
>>> > >
>>> >
>>> > Gah! Thought I could see what may be causing it in 64-bit kernels, but
>>> > not
>>> > 32-bit (my x86 asm is not particularly hot).
>>> >
>>> > I think we /may/ be hitting issues with how the memcpy function gets
>>> > implemented on specific architectures. The tsi148 is a PCI/X to VME
>>> > bridge,
>>> > if it receives a series of 8-bit reads or writes, it translates these to
>>> > 8-bit reads or writes on the VME bus, which is not necessarily what we
>>> > want.
>>> >
>>> > According to the data sheet, the card you are using has an Intel 6300ESB
>>> > ICH, which seems to be PCI/X (so we can rule out PCIe to PCI/X bridges
>>> > or
>>> > something like that doing nasty things).
>>> >
>>> > I think (if I follow everything correctly) then the memcpy for 32-bit is
>>> > handled by __memcpy in arch/x86/include/asm/string_32.h:
>>> >
>>> > static __always_inline void *__memcpy(void *to, const void *from, size_t
>>> > n)
>>> > {
>>> >         int d0, d1, d2;
>>> >         asm volatile("rep ; movsl\n\t"
>>> >                      "movl %4,%%ecx\n\t"
>>> >                      "andl $3,%%ecx\n\t"
>>> >                      "jz 1f\n\t"
>>> >                      "rep ; movsb\n\t"
>>> >                      "1:"
>>> >                      : "=&c" (d0), "=&D" (d1), "=&S" (d2)
>>> >                      : "0" (n / 4), "g" (n), "1" ((long)to), "2"
>>> > ((long)from)
>>> >                      : "memory");
>>> >         return to;
>>> > }
>>> >
>>> > I'd expected this function to use movl (32-bit moves) where possible,
>>> > but
>>> > movsb to get to naturally aligned moves (which is something that we deal
>>> > with in the VME code already to use 16-bit reads where we can).
>>> >
>>> > Greg (as co-maintainer of the VME subsystem :-) ), Am I reading this
>>> > right?
>>>
>>> Yes, but why would it matter if we copy by bytes, bits, or dwords to a
>>> memory-backed bus?  By definition, that shouldn't matter.  If it does
>>> matter, then we shouldn't be using memcpy, or relying on the compiler to
>>> do the copying, but rather we need to use a write() function instead as
>>> this would be IO memory that does care about this type of thing.
>>>
>>
>> The drivers are using the functions memcpy_toio and memcpy_fromio for the
>> bulk of the transfer, these seem to equate to memcpy on x86.
>>
>> In this instance it does matter - some devices can only accept certain width
>> accesses and the VME bridge will translate a 8-bit PCI transfer to a 8-bit
>> transfer on the VME bus.
>>
>> So (and I haven't even compile tested this yet), this?:
>>
>> diff --git a/drivers/vme/bridges/vme_ca91cx42.c
>> b/drivers/vme/bridges/vme_ca91cx42.c
>> index 1425d22c..0d87ebd 100644
>> --- a/drivers/vme/bridges/vme_ca91cx42.c
>> +++ b/drivers/vme/bridges/vme_ca91cx42.c
>> @@ -894,9 +894,9 @@ static ssize_t ca91cx42_master_read(struct
>> vme_master_resource *image,
>>         }
>>
>>         count32 = (count - done) & ~0x3;
>> -       if (count32 > 0) {
>> -               memcpy_fromio(buf + done, addr + done, (unsigned int)count);
>> -               done += count32;
>> +       while (done < count32) {
>> +               *(u32 *)(buf + done) = ioread32(addr + done);
>> +               done += 4;
>>         }
>>
>>         if ((count - done) & 0x2) {
>> @@ -948,9 +948,9 @@ static ssize_t ca91cx42_master_write(struct
>> vme_master_resource *image,
>>         }
>>
>>         count32 = (count - done) & ~0x3;
>> -       if (count32 > 0) {
>> -               memcpy_toio(addr + done, buf + done, count32);
>> -               done += count32;
>> +       while (done < count32) {
>> +               iowrite32(*(u32 *)(buf + done), addr + done);
>> +               done += 4;
>>         }
>>
>>         if ((count - done) & 0x2) {
>> diff --git a/drivers/vme/bridges/vme_tsi148.c
>> b/drivers/vme/bridges/vme_tsi148.c
>> index 5fbd08f..bc82991 100644
>> --- a/drivers/vme/bridges/vme_tsi148.c
>> +++ b/drivers/vme/bridges/vme_tsi148.c
>> @@ -1297,9 +1297,9 @@ static ssize_t tsi148_master_read(struct
>> vme_master_resource *image, void *buf,
>>         }
>>
>>         count32 = (count - done) & ~0x3;
>> -       if (count32 > 0) {
>> -               memcpy_fromio(buf + done, addr + done, count32);
>> -               done += count32;
>> +       if (done < count32) {
>> +               *(u32 *)(buf + done) = ioread32(addr + done);
>> +               done += 4;
>>         }
>>
>>         if ((count - done) & 0x2) {
>> @@ -1379,9 +1379,9 @@ static ssize_t tsi148_master_write(struct
>> vme_master_resource *image, void *buf,
>>         }
>>
>>         count32 = (count - done) & ~0x3;
>> -       if (count32 > 0) {
>> -               memcpy_toio(addr + done, buf + done, count32);
>> -               done += count32;
>> +       if (done < count32) {
>> +               iowrite32(*(u32 *)(buf + done), addr + done);
>> +               done += 4;
>>         }
>>
>>         if ((count - done) & 0x2) {
>>
>>
>>>
>>> Does that help?
>>>
>>
>> Yep :-)
>>
>> Thanks Greg.
>
> That certainly looks promising. I can give that patch a try later this week.
>
> Thanks.

Good news Martyn.

I had a bit of time today to test your patch and am happy to say it
worked! No more bus errors.

Now I can work on getting the actual data acquisition working :-)

Thanks again for all your help.

--Mike


More information about the devel mailing list