vme_tsi148 question

Michael Kenney mfkenney at gmail.com
Wed Feb 5 21:38:04 UTC 2014


On Wed, Feb 5, 2014 at 1:22 PM, Martyn Welch <martyn at welchs.me.uk> wrote:
>
>
>
> On 5 February 2014 17:41, Greg KH <gregkh at linuxfoundation.org> wrote:
>>
>> On Tue, Feb 04, 2014 at 06:34:13PM +0000, Martyn Welch wrote:
>> > On 04/02/14 16:34, Michael Kenney wrote:
>> > >Hi Martyn,
>> > >
>> > >On Tue, Feb 4, 2014 at 7:28 AM, Martyn Welch <martyn.welch at ge.com
>> > ><mailto:martyn.welch at ge.com>> wrote:
>> > >
>> > >    On 28/12/13 00:34, Michael Kenney wrote:
>> > >
>> > >        Hi Martyn,
>> > >
>> > >        On Fri, Dec 27, 2013 at 4:23 PM, Martyn Welch
>> > >        <martyn at welchs.me.uk <mailto:martyn at welchs.me.uk>> wrote:
>> > >
>> > >            On 27/12/13 20:15, Michael Kenney wrote:
>> > >
>> > >
>> > >                We are using the vme_tsi148 bridge driver along with
>> > > the
>> > >                vme_user
>> > >                driver to access the VME boards. The A/D board requires
>> > >                D32 bus cycles
>> > >                and the VME master window is configured accordingly,
>> > >                however, when
>> > >                monitoring the bus cycles with a logic analyzer, we
>> > >                noticed that the
>> > >                CPU is transferring one byte at a time (i.e. four D8
>> > >                transfers rather
>> > >                than one D32).
>> > >
>> > >                Is this the expected behavior of the tsi148 driver?
>> > >
>> > >
>> > >            Hi Mike,
>> > >
>> > >            This is certainly not the expected behaviour - if the
>> > > window
>> > >            is configured
>> > >            for D32 then it should do 32 bit transfers where possible.
>> > >
>> > >            I've heard of this happening recently, but haven't yet been
>> > >            able to
>> > >            replicate it. Which VME board are you running Linux on and
>> > >            which flavour of
>> > >            Linux?
>> > >
>> > >
>> > >        I'm running Debian 7.2 with kernel 3.2 on a Fastwel CPC600
>> > >        (Pentium M
>> > >        based CPU board).
>> > >
>> > >
>> > >    I haven't forgotten about this, still not sure exactly what is
>> > >    happening.
>> > >
>> > >    Is your install/kernel 32 or 64 bit?
>> > >
>> > >    Are you doing single 32-bit transfers, or are you seeing this on
>> > >    longer transfers (i.e. copying a buffer full of data)?
>> > >
>> > >
>> > >Thanks for getting back to me.
>> > >
>> > >I'm running a 32-bit kernel and I see this behavior on all transfers
>> > >regardless of buffer size.
>> > >
>> >
>> > Gah! Thought I could see what may be causing it in 64-bit kernels, but
>> > not
>> > 32-bit (my x86 asm is not particularly hot).
>> >
>> > I think we /may/ be hitting issues with how the memcpy function gets
>> > implemented on specific architectures. The tsi148 is a PCI/X to VME
>> > bridge,
>> > if it receives a series of 8-bit reads or writes, it translates these to
>> > 8-bit reads or writes on the VME bus, which is not necessarily what we
>> > want.
>> >
>> > According to the data sheet, the card you are using has an Intel 6300ESB
>> > ICH, which seems to be PCI/X (so we can rule out PCIe to PCI/X bridges
>> > or
>> > something like that doing nasty things).
>> >
>> > I think (if I follow everything correctly) then the memcpy for 32-bit is
>> > handled by __memcpy in arch/x86/include/asm/string_32.h:
>> >
>> > static __always_inline void *__memcpy(void *to, const void *from, size_t
>> > n)
>> > {
>> >         int d0, d1, d2;
>> >         asm volatile("rep ; movsl\n\t"
>> >                      "movl %4,%%ecx\n\t"
>> >                      "andl $3,%%ecx\n\t"
>> >                      "jz 1f\n\t"
>> >                      "rep ; movsb\n\t"
>> >                      "1:"
>> >                      : "=&c" (d0), "=&D" (d1), "=&S" (d2)
>> >                      : "0" (n / 4), "g" (n), "1" ((long)to), "2"
>> > ((long)from)
>> >                      : "memory");
>> >         return to;
>> > }
>> >
>> > I'd expected this function to use movl (32-bit moves) where possible,
>> > but
>> > movsb to get to naturally aligned moves (which is something that we deal
>> > with in the VME code already to use 16-bit reads where we can).
>> >
>> > Greg (as co-maintainer of the VME subsystem :-) ), Am I reading this
>> > right?
>>
>> Yes, but why would it matter if we copy by bytes, bits, or dwords to a
>> memory-backed bus?  By definition, that shouldn't matter.  If it does
>> matter, then we shouldn't be using memcpy, or relying on the compiler to
>> do the copying, but rather we need to use a write() function instead as
>> this would be IO memory that does care about this type of thing.
>>
>
> The drivers are using the functions memcpy_toio and memcpy_fromio for the
> bulk of the transfer, these seem to equate to memcpy on x86.
>
> In this instance it does matter - some devices can only accept certain width
> accesses and the VME bridge will translate a 8-bit PCI transfer to a 8-bit
> transfer on the VME bus.
>
> So (and I haven't even compile tested this yet), this?:
>
> diff --git a/drivers/vme/bridges/vme_ca91cx42.c
> b/drivers/vme/bridges/vme_ca91cx42.c
> index 1425d22c..0d87ebd 100644
> --- a/drivers/vme/bridges/vme_ca91cx42.c
> +++ b/drivers/vme/bridges/vme_ca91cx42.c
> @@ -894,9 +894,9 @@ static ssize_t ca91cx42_master_read(struct
> vme_master_resource *image,
>         }
>
>         count32 = (count - done) & ~0x3;
> -       if (count32 > 0) {
> -               memcpy_fromio(buf + done, addr + done, (unsigned int)count);
> -               done += count32;
> +       while (done < count32) {
> +               *(u32 *)(buf + done) = ioread32(addr + done);
> +               done += 4;
>         }
>
>         if ((count - done) & 0x2) {
> @@ -948,9 +948,9 @@ static ssize_t ca91cx42_master_write(struct
> vme_master_resource *image,
>         }
>
>         count32 = (count - done) & ~0x3;
> -       if (count32 > 0) {
> -               memcpy_toio(addr + done, buf + done, count32);
> -               done += count32;
> +       while (done < count32) {
> +               iowrite32(*(u32 *)(buf + done), addr + done);
> +               done += 4;
>         }
>
>         if ((count - done) & 0x2) {
> diff --git a/drivers/vme/bridges/vme_tsi148.c
> b/drivers/vme/bridges/vme_tsi148.c
> index 5fbd08f..bc82991 100644
> --- a/drivers/vme/bridges/vme_tsi148.c
> +++ b/drivers/vme/bridges/vme_tsi148.c
> @@ -1297,9 +1297,9 @@ static ssize_t tsi148_master_read(struct
> vme_master_resource *image, void *buf,
>         }
>
>         count32 = (count - done) & ~0x3;
> -       if (count32 > 0) {
> -               memcpy_fromio(buf + done, addr + done, count32);
> -               done += count32;
> +       if (done < count32) {
> +               *(u32 *)(buf + done) = ioread32(addr + done);
> +               done += 4;
>         }
>
>         if ((count - done) & 0x2) {
> @@ -1379,9 +1379,9 @@ static ssize_t tsi148_master_write(struct
> vme_master_resource *image, void *buf,
>         }
>
>         count32 = (count - done) & ~0x3;
> -       if (count32 > 0) {
> -               memcpy_toio(addr + done, buf + done, count32);
> -               done += count32;
> +       if (done < count32) {
> +               iowrite32(*(u32 *)(buf + done), addr + done);
> +               done += 4;
>         }
>
>         if ((count - done) & 0x2) {
>
>
>>
>> Does that help?
>>
>
> Yep :-)
>
> Thanks Greg.

That certainly looks promising. I can give that patch a try later this week.

Thanks.

--Mike


More information about the devel mailing list