Panic in drm_calc_timestamping_constants in staging-next

Ville Syrjälä ville.syrjala at linux.intel.com
Mon Nov 16 13:13:12 UTC 2015


On Sun, Nov 15, 2015 at 01:17:00PM -0500, ira.weiny wrote:
> With the latest staging-testing and staging-next[*] I am getting the following panic.
> 
> [*] git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
> 
> 
> [   11.232549] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
> [   11.232568] IP: [<ffffffffa0103206>] drm_calc_timestamping_constants+0x86/0x130 [drm]

http://lists.freedesktop.org/archives/dri-devel/2015-November/094298.html

> [   11.232571] PGD 0 
> [   11.232574] Oops: 0002 [#1] SMP 
> [   11.232595] Modules linked in: ib_qib mgag200(+) drm_kms_helper isci
> syscopyarea sysfillrect sysimgblt fb_sys_fops ib_mad ttm libsas mlx4_core(+)
> ib_core igb drm ahci scsi_transport_sas libahci ptp libata firewire_ohci
> ib_addr pps_core firewire_core dca i2c_algo_bit i2c_core crc_itu_t
> [   11.232600] CPU: 13 PID: 497 Comm: systemd-udevd Not tainted 4.3.0+ #1
> [   11.232601] Hardware name: Intel Corporation W2600CR ........../W2600CR, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
> [   11.232603] task: ffff8800343abfc0 ti: ffff8804244a8000 task.ti: ffff8804244a8000
> [   11.232618] RIP: 0010:[<ffffffffa0103206>]  [<ffffffffa0103206>] drm_calc_timestamping_constants+0x86/0x130 [drm]
> [   11.232620] RSP: 0018:ffff8804244ab118  EFLAGS: 00010246
> [   11.232621] RAX: 0000000000fe4c00 RBX: ffff880424b10160 RCX: 0000000000000540
> [   11.232623] RDX: 0000000000000000 RSI: 000000000000fde8 RDI: ffff880424b10000
> [   11.232624] RBP: ffff8804244ab148 R08: ffff8804244a8000 R09: 000000029d828339
> [   11.232626] R10: 00000000000050c4 R11: 0000000000000000 R12: 0000000000fe4c00
> [   11.232627] R13: ffff880424b10000 R14: 0000000000000000 R15: 000000000000fde8
> [   11.232629] FS:  00007fecf960d880(0000) GS:ffff88082d940000(0000) knlGS:0000000000000000
> [   11.232631] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   11.232632] CR2: 00000000000000b0 CR3: 0000000424493000 CR4: 00000000000406e0
> [   11.232634] Stack:
> [   11.232637]  ffff8804244ab148 ffff880424b10000 ffff88042a86bb40 ffff88042a86b800
> [   11.232639]  ffff88042a86bb48 ffff88042a86bb40 ffff8804244ab378 ffffffffa030c7e7
> [   11.232642]  ffff880424b10090 0000000000000000 ffff880424b10160 0000000000000000
> [   11.232642] Call Trace:
> [   11.232655]  [<ffffffffa030c7e7>] drm_crtc_helper_set_mode+0x3d7/0x4b0 [drm_kms_helper]
> [   11.232665]  [<ffffffffa030d7d4>] drm_crtc_helper_set_config+0x8d4/0xb10 [drm_kms_helper]
> [   11.232683]  [<ffffffffa010c874>] drm_mode_set_config_internal+0x64/0x100 [drm]
> [   11.232694]  [<ffffffffa0319352>] drm_fb_helper_pan_display+0xa2/0x280 [drm_kms_helper]
> [   11.232703]  [<ffffffff81395a8b>] fb_pan_display+0xbb/0x170
> [   11.232708]  [<ffffffff8138fd80>] bit_update_start+0x20/0x50
> [   11.232712]  [<ffffffff8138e62b>] fbcon_switch+0x39b/0x590
> [   11.232721]  [<ffffffff8140d260>] redraw_screen+0x1a0/0x240
> [   11.232725]  [<ffffffff8140dc28>] vc_do_resize+0x4d8/0x500
> [   11.232729]  [<ffffffff8140dc6f>] vc_resize+0x1f/0x30
> [   11.232732]  [<ffffffff8138ec32>] fbcon_init+0x342/0x530
> [   11.232737]  [<ffffffff8140b8ea>] visual_init+0xca/0x130
> [   11.232741]  [<ffffffff8140dff6>] do_bind_con_driver+0x146/0x310
> [   11.232746]  [<ffffffff8140e4e1>] do_take_over_console+0x141/0x1b0
> [   11.232750]  [<ffffffff8138a187>] do_fbcon_takeover+0x57/0xb0
> [   11.232754]  [<ffffffff8138f79b>] fbcon_event_notify+0x60b/0x750
> [   11.232760]  [<ffffffff810a5889>] notifier_call_chain+0x49/0x70
> [   11.232764]  [<ffffffff810a5bcd>] __blocking_notifier_call_chain+0x4d/0x70
> [   11.232768]  [<ffffffff810a5c06>] blocking_notifier_call_chain+0x16/0x20
> [   11.232772]  [<ffffffff8139563b>] fb_notifier_call_chain+0x1b/0x20
> [   11.232775]  [<ffffffff81397691>] register_framebuffer+0x1f1/0x330
> [   11.232784]  [<ffffffffa031a9ba>] drm_fb_helper_initial_config+0x27a/0x3d0 [drm_kms_helper]
> [   11.232792]  [<ffffffffa0341b4d>] mgag200_fbdev_init+0xdd/0xf0 [mgag200]
> [   11.232798]  [<ffffffffa0340586>] mgag200_modeset_init+0x176/0x1e0 [mgag200]
> [   11.232804]  [<ffffffffa033c659>] mgag200_driver_load+0x3f9/0x580 [mgag200]
> [   11.232819]  [<ffffffffa0106007>] drm_dev_register+0xa7/0xb0 [drm]
> [   11.232834]  [<ffffffffa01084ef>] drm_get_pci_dev+0x8f/0x1e0 [drm]
> [   11.232840]  [<ffffffffa034137b>] mga_pci_probe+0x9b/0xc0 [mgag200]
> [   11.232848]  [<ffffffff813690f5>] local_pci_probe+0x45/0xa0
> [   11.232853]  [<ffffffff8136a53c>] pci_device_probe+0xfc/0x140
> [   11.232858]  [<ffffffff8145566b>] driver_probe_device+0x21b/0x460
> [   11.232861]  [<ffffffff81455935>] __driver_attach+0x85/0x90
> [   11.232864]  [<ffffffff814558b0>] ? driver_probe_device+0x460/0x460
> [   11.232868]  [<ffffffff8145337c>] bus_for_each_dev+0x6c/0xc0
> [   11.232871]  [<ffffffff81454fce>] driver_attach+0x1e/0x20
> [   11.232873]  [<ffffffff81454ae0>] bus_add_driver+0x1d0/0x290
> [   11.232876]  [<ffffffff814562e0>] driver_register+0x60/0xe0
> [   11.232880]  [<ffffffff81368a9c>] __pci_register_driver+0x4c/0x50
> [   11.232894]  [<ffffffffa0108720>] drm_pci_init+0xe0/0x110 [drm]
> [   11.232897]  [<ffffffffa0348000>] ? 0xffffffffa0348000
> [   11.232902]  [<ffffffffa0348032>] mgag200_init+0x32/0x1000 [mgag200]
> [   11.232907]  [<ffffffff8100213d>] do_one_initcall+0xcd/0x1f0
> [   11.232911]  [<ffffffff811c5e56>] ? __vunmap+0xa6/0xf0
> [   11.232918]  [<ffffffff811e2c1b>] ? kmem_cache_alloc_trace+0x17b/0x1e0
> [   11.232921]  [<ffffffff81185243>] ? do_init_module+0x27/0x1e8
> [   11.232924]  [<ffffffff8118527c>] do_init_module+0x60/0x1e8
> [   11.232930]  [<ffffffff8110a6e3>] load_module+0x12b3/0x1980
> [   11.232933]  [<ffffffff81106b10>] ? __symbol_put+0x60/0x60
> [   11.232938]  [<ffffffff81106f80>] ? copy_module_from_fd.isra.51+0x110/0x160
> [   11.232943]  [<ffffffff8110afbf>] SyS_finit_module+0x9f/0xd0
> [   11.232949]  [<ffffffff8169146e>] entry_SYSCALL_64_fastpath+0x12/0x71
> [   11.232976] Code: f6 31 d2 41 89 c2 8b 83 b4 00 00 00 0f af c1 48 98 48 69
> c0 40 42 0f 00 48 f7 f6 f6 43 74 10 41 89 c4 75 26 f6 05 fa 6f 03 00 01 <45> 89
> 96 b0 00 00 00 45 89 a6 ac 00 00 00 75 35 48 83 c4 08 5b 
> [   11.232990] RIP  [<ffffffffa0103206>] drm_calc_timestamping_constants+0x86/0x130 [drm]
> [   11.232991]  RSP <ffff8804244ab118>
> [   11.232992] CR2: 00000000000000b0
> [   11.232996] ---[ end trace 402fdf8659b2f760 ]---
> [   11.238445] Kernel panic - not syncing: Fatal exception
> [   11.238510] Kernel Offset: disabled
> 
> 
> I believe it is related to (but not directly caused by) this commit:
> 
> 
> commit eba1f35dfe145247c7eb690c7c32740fde8ec699
> Author: Ville Syrjälä <ville.syrjala at linux.intel.com>
> Date:   Mon Sep 14 22:43:43 2015 +0300
> 
>     drm: Move timestamping constants into drm_vblank_crtc
>     
>     Collect the timestamping constants alongside the rest of the relevant
>     stuff under drm_vblank_crtc.
>     
>     We can now get rid of the 'refcrtc' parameter to
>     drm_calc_vbltimestamp_from_scanoutpos().
>     
>     Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
>     Reviewed-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>     Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> 
> 
> 
> The reason I think it is not caused by the above commit is that when I run with
> this commit I get a __hang__ rather than a panic.  But running with the parent
> commit (below) works just fine:
> 
> commit 942840371cde152fe57c15e0e8483b760e7763e3
> Author: Matt Roper <matthew.d.roper at intel.com>
> Date:   Mon Sep 21 17:21:48 2015 -0700
> 
>     drm/fbdev: Update legacy plane->fb refcounting for atomic restore
>     
>     Starting with commit
>     
>             commit 28cc504e8d52248962f5b485bdc65f539e3fe21d
>             Author: Rob Clark <robdclark at gmail.com>
>             Date:   Tue Aug 25 15:36:00 2015 -0400
>     
>                 drm/i915: enable atomic fb-helper
>     
>     I've been seeing some panics on i915 when the DRM master shuts down that appear
>     to be caused by using an already-freed framebuffer (i.e., we're unexpectedly
>     dropping our initial FB's reference count to 0 and freeing it, which causes a
>     crash when we try to restore it later).  Digging deeper, the state FB
>     refcounting is working as expected, but we seem to be missing proper
>     refcounting on the legacy plane->fb pointers in the new atomic fbdev code.
>     
>     Tracking plane->old_fb and then doing a ref/unref at the end of the
>     fbdev restore like we do in the legacy ioctl's ensures we don't miscount
>     references on plane->fb and avoids the panics.
>     
>     v2 from Daniel:
>     
>     Really do what the atomic ioctl does:
>     - Also update plane->fb and plane->crtc.
>           - Clear out plane->old_fb on failures too.
>                 
>     v3: git add everything. Oops.
>     
>     v4: Also clear old_fb in all other failure paths, spotted by David.
>     
>     Cc: Rob Clark <robdclark at gmail.com>
>     Cc: intel-gfx at lists.freedesktop.org
>     Cc: David Herrmann <dh.herrmann at gmail.com>
>     Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>     Signed-off-by: Matt Roper <matthew.d.roper at intel.com> (v1)
>     Reviewd-by: David Herrmann <dh.herrmann at gmail.com>
>     Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> 
> 
> Because I am getting a hang I'm not quite sure where to proceed with bisect
> beyond the commit in question.
> 
> A bit of digging reveals that it may be that vblank has not been allocated at
> all. Using the following hack:
> 
>         struct drm_vblank_crtc *vblank = &crtc->dev->vblank[drm_crtc_index(crtc)];
> 
> 
> 12:46:25 > git di
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index eba6337f5860..649c32c00b36 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -641,8 +641,13 @@ void drm_calc_timestamping_constants(struct drm_crtc *crtc,
>                 DRM_ERROR("crtc %u: Can't calculate constants, dotclock = 0!\n",
>                           crtc->base.id);
>  
> -       vblank->linedur_ns  = linedur_ns;
> -       vblank->framedur_ns = framedur_ns;
> +       if ((u64)vblank < 1000) {
> +               DRM_ERROR("crtc %u: Can't calculate linedur_ns or framedur_ns; vblank %p; drm_crtc_index(crtc) %d\n",
> +                         crtc->base.id, vblank, drm_crtc_index(crtc));
> +       } else {
> +               vblank->linedur_ns  = linedur_ns;
> +               vblank->framedur_ns = framedur_ns;
> +       }
>  
>         DRM_DEBUG("crtc %u: hwmode: htotal %d, vtotal %d, vdisplay %d\n",
>                   crtc->base.id, mode->crtc_htotal,
> 
> 
> I got the following output.
> 
> kernel: [drm:drm_calc_timestamping_constants [drm]] *ERROR* crtc 19: Can't calculate linedur_ns or framedur_ns; vblank (null); drm_crtc_index(crtc) 0
> 
> So this must mean that the vblank array is not allocated yet?
> 
> What intervening patch between 4.3 and the current staging-next might change
> where/how vblank is allocated?
> 
> Thanks,
> Ira

-- 
Ville Syrjälä
Intel OTC

