|
vb2_dma_contig_init_ctx() returns an error pointer on failure, so the NULL check is not necessary.
Signed-off-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Move firmware allocation from open to probe to avoid problems
when using CMA for allocation. In certain circumstances CMA may allocate
a buffer that is not at the beginning of the MFC memory area.
Signed-off-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Checking unsigned variable for negative value always returns false.
Hence make this value signed as we expect it to be negative too.
Fixes the following smatch warning:
drivers/media/platform/s5p-mfc/s5p_mfc_opr_v6.c:572
s5p_mfc_set_enc_ref_buffer_v6() warn: unsigned 'buf_size1' is never
less than zero.
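
A minimal sketch of the pitfall, assuming hypothetical helper names (this
is not the driver code): a signed size can go negative and be tested,
while an unsigned one silently wraps around.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if `needed` bytes still fit.
 * With a signed counter the subtraction can go negative and be tested. */
static int fits_signed(int buf_size, int needed)
{
    buf_size -= needed;
    return buf_size >= 0;
}

/* The same subtraction on unsigned values wraps around to a huge
 * positive number, so a "< 0" test on the result could never be true. */
static unsigned int wrapped_unsigned(unsigned int buf_size, unsigned int needed)
{
    return buf_size - needed;
}
```

For example, fits_signed(4, 8) reports failure, whereas
wrapped_unsigned(4, 8) yields UINT_MAX - 3 rather than a negative value.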
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Fixed a trivial typo.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
The code returns before this statement. Hence not required.
Silences the following smatch message:
drivers/media/platform/s5p-mfc/s5p_mfc_opr_v5.c:525
s5p_mfc_set_dec_frame_buffer_v5() info: ignoring unreachable code.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Flushing of delay DPB buffers has to be done during stream off.
In MFC v6, it is done with a risc to host command.
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Signed-off-by: Arun Mankuzhi <arun.m@samsung.com>
Acked-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
The variable index is initialized but never used
otherwise, so remove the unused variable.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
This patch adds device tree support for the MFC driver.
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Acked-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
This patch adds the dmabuf export buffer feature to the
Exynos G-Scaler driver.
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Replace IS_ERR_OR_NULL with IS_ERR on clk_get results.
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Fixes the following checkpatch warnings:
WARNING: sizeof *fmt should be sizeof(*fmt)
WARNING: sizeof *res should be sizeof(*res)
WARNING: sizeof *res should be sizeof(*res)
WARNING: sizeof sd->name should be sizeof(sd->name)
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Fixes the following checkpatch warning:
WARNING: sizeof *ctx should be sizeof(*ctx)
FILE: media/platform/s5p-tv/hdmiphy_drv.c:287:
ctx = kzalloc(sizeof *ctx, GFP_KERNEL);
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Silences checkpatch warnings of type:
WARNING: sizeof mdev->res should be sizeof(mdev->res)
FILE: media/platform/s5p-tv/mixer_drv.c:301:
memset(&mdev->res, 0, sizeof mdev->res);
WARNING: sizeof *mdev should be sizeof(*mdev)
FILE: media/platform/s5p-tv/mixer_drv.c:385:
mdev = kzalloc(sizeof *mdev, GFP_KERNEL);
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Silences checkpatch warnings of the type:
WARNING: sizeof filter_y_horiz_tap8 should be sizeof(filter_y_horiz_tap8)
FILE: media/platform/s5p-tv/mixer_reg.c:473:
filter_y_horiz_tap8, sizeof filter_y_horiz_tap8);
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Silences several checkpatch warnings of the type:
WARNING: sizeof *out should be sizeof(*out)
FILE: media/platform/s5p-tv/mixer_video.c:98:
out = kzalloc(sizeof *out, GFP_KERNEL);
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Silences the following checkpatch warnings:
WARNING: sizeof *sdev should be sizeof(*sdev)
FILE: media/platform/s5p-tv/sdo_drv.c:304:
sdev = devm_kzalloc(&pdev->dev, sizeof *sdev, GFP_KERNEL);
WARNING: sizeof sdev->sd.name should be sizeof(sdev->sd.name)
FILE: media/platform/s5p-tv/sdo_drv.c:394:
strlcpy(sdev->sd.name, "s5p-sdo", sizeof sdev->sd.name);
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Replace IS_ERR_OR_NULL with IS_ERR on clk_get results.
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Use proper return value test for clk_get() and devm_regulator_get()
functions and propagate any errors from the clock and the regulator
subsystem to the driver core. In two cases a proper error code is
now returned rather than 0.
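
The error-pointer convention behind this change can be sketched in
userspace as follows. The ERR_PTR/PTR_ERR/IS_ERR definitions here are
simplified stand-ins for the kernel helpers, and fake_clk_get() and
fake_probe() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error codes live in the last MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct clk { int rate; };   /* hypothetical stand-in type */

/* Hypothetical getter that reports failure as an error pointer, not NULL. */
static struct clk *fake_clk_get(int available)
{
    static struct clk the_clk = { 24000000 };
    return available ? &the_clk : ERR_PTR(-ENOENT);
}

/* Probe-style caller: test with IS_ERR() and propagate the real code. */
static int fake_probe(int clk_available)
{
    struct clk *clk = fake_clk_get(clk_available);

    if (IS_ERR(clk))
        return (int)PTR_ERR(clk);   /* propagate, do not return 0 */
    return 0;
}
```

With this shape, fake_probe(0) returns -ENOENT to its caller instead of
reporting success.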
Reported-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
This patch eliminates a potential AB-BA deadlock when one process calls
open(), or a VIDIOC_S/TRY_FMT ioctl on the FIMC capture video node, while
another thread is reconfiguring media links via the media device node:
/dev/video? open()          /dev/media? MEDIA_IOC_SETUP_LINK ioctl
mutex_lock(video_lock)      mutex_lock(graph_lock)
fimc_pipeline_open()        fimc_md_link_notify()
mutex_lock(graph_lock)      mutex_lock(video_lock)
...                         ...
The deadlock is avoided by always taking the graph mutex first in video
node open() or an ioctl, before the video lock is acquired. Reversed
order seems impossible, since media device driver's link_notify callback
is called with media graph mutex already held.
To ensure proper locking order VIDIOC_S_FMT and VIDIOC_TRY_FMT ioctls are
not serialized in the v4l2-core and the driver takes care of it itself.
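
The ordering rule can be illustrated with a small bookkeeping sketch. It
does no real locking; it only records the order in which two ranked
locks are taken and counts out-of-order acquisitions. All names
(lock_tracker, track_acquire, the rank values) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical ranks: a lock with a lower rank must be taken first. */
enum lock_rank { GRAPH_LOCK = 1, VIDEO_LOCK = 2 };

struct lock_tracker {
    int highest_held;   /* highest rank currently held, 0 if none */
    int violations;     /* number of out-of-order acquisitions seen */
};

static void track_acquire(struct lock_tracker *t, enum lock_rank rank)
{
    if ((int)rank < t->highest_held)
        t->violations++;            /* graph lock taken after video lock */
    else
        t->highest_held = (int)rank;
}

static void track_release_all(struct lock_tracker *t)
{
    t->highest_held = 0;
}

/* Order used after the patch: graph mutex first, then the video lock. */
static int open_graph_first(struct lock_tracker *t)
{
    track_acquire(t, GRAPH_LOCK);
    track_acquire(t, VIDEO_LOCK);
    track_release_all(t);
    return t->violations;
}

/* Order used before the patch: video lock first, then the graph mutex. */
static int open_video_first(struct lock_tracker *t)
{
    track_acquire(t, VIDEO_LOCK);
    track_acquire(t, GRAPH_LOCK);
    track_release_all(t);
    return t->violations;
}
```

Starting from a zeroed tracker, open_graph_first() records no violation,
while open_video_first() records one, which corresponds to the path that
could deadlock against the link_notify callback.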
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Replace IS_ERR_OR_NULL with IS_ERR on clk_get results.
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Replace the hard coded csi_sensors[] array size with a relevant
constant to make sure we don't iterate beyond the actual array.
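
A minimal sketch of the idea, assuming a hypothetical constant name and
value: the array declaration and the loop bound share one constant, so
they cannot drift apart.

```c
#include <stddef.h>

#define CSI_SENSORS_MAX 2   /* hypothetical name and value */

struct sensor_slot { int in_use; };   /* hypothetical element type */

static struct sensor_slot csi_sensors[CSI_SENSORS_MAX];

/* Iterate using the same constant that sizes the array. */
static size_t count_free_slots(void)
{
    size_t i, n = 0;

    for (i = 0; i < CSI_SENSORS_MAX; i++)
        if (!csi_sensors[i].in_use)
            n++;
    return n;
}
```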
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
This fixes the following issue found with a static analysis tool:
Pointer 'ffmt' returned from call to function 'fimc_capture_try_format'
at line 1522 may be NULL and may be dereferenced at line 1535.
Although it shouldn't happen in practice, add the NULL pointer check
to be on the safe side.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Converting it to platform code can make the code smaller.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
After removing i.mx25 support and buf_cleanup() callback,
buffer states are not used in the code any longer.
Signed-off-by: Javier Martin <javier.martin@vista-silicon.com>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
All necessary tasks to end the streaming properly are
already implemented in mx2_stop_streaming() and nothing
remains to be done in this callback.
Furthermore, it only included debug messages so it can
be removed.
Signed-off-by: Javier Martin <javier.martin@vista-silicon.com>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
i.MX25 support has been broken for several releases
now and nobody seems to care about it.
Signed-off-by: Javier Martin <javier.martin@vista-silicon.com>
[g.liakhovetski@gmx.de: rebased on top of cpu_is_mx27() removal]
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
I2C drivers can use devm_kzalloc() too in their .probe() methods. Doing so
simplifies their clean up paths.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
struct soc_camera_link currently contains fields used by both sensor and
bridge drivers. To make subdevice driver re-use simpler, split it into
host and subdevice parts.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Currently soc-camera has a per-device node lock, used for video operations
and a per-host lock for code paths, modifying host's pipeline. Manipulating
the two locks increases complexity and doesn't bring any advantages. This
patch removes the per-device lock and uses the per-host lock for all
operations.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Currently devm_regulator_bulk_get() is called by soc-camera during host
driver probing, but the regulators are attached to the camera platform
device, which persists regardless of whether the host probed successfully.
This can lead to repeated regulator requests if the host driver is
re-probed. Move the call to platform device probing to avoid this.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
The recently introduced host_lock causes lockdep warnings. Besides, list
enumeration in scan_add_host() must be protected by holding the list_lock.
OTOH, holding .video_lock in soc_camera_open() isn't enough to protect
the host during its building of the pipeline, because .video_lock is per
soc-camera device. If, e.g. more than one sensor can be attached to a host
and the user tries to open both device nodes simultaneously, host's .add()
method can be called simultaneously for both sensors. Fix these problems
by holding list_lock instead of .host_lock in scan_add_host() and taking
it shortly at the beginning of soc_camera_open(), and using .host_lock to
protect host's .add() and .remove() operations only.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Most of the *_ops and other structures in vivi.c were already declared
const, but some were not. Constify them so that code/data take less space:
$ size drivers/media/platform/vivi.o
          text    data     bss     dec     hex filename
before:  12569     248       8   12825    3219 drivers/media/platform/vivi.o
after:   12308      20       8   12336    3030 drivers/media/platform/vivi.o
i.e. vivi.o is now ~500 bytes smaller.
Signed-off-by: Kirill Smelkov <kirr@navytux.spb.ru>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Select is evil as it has issues with dependencies. Better to convert
it to use depends on.
That fixes a breakage with out-of-tree compilation of the media
tree.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
versioncheck script complains about missing linux/version.h header
file.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
There is no point in PTR_ERR()ing a NULL pointer, use a real error
instead.
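
A short userspace sketch of why this matters. PTR_ERR() here is a
simplified stand-in for the kernel helper, and the two status functions
are hypothetical: converting a NULL pointer yields 0, which a caller
reads as success.

```c
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's PTR_ERR(). */
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }

/* Buggy pattern: a failed allocation returns NULL, and PTR_ERR(NULL) is 0. */
static int status_buggy(const void *buf)
{
    if (!buf)
        return (int)PTR_ERR(buf);   /* evaluates to 0, i.e. "success" */
    return 0;
}

/* Fixed pattern: report a real error code for the NULL case. */
static int status_fixed(const void *buf)
{
    if (!buf)
        return -ENOMEM;
    return 0;
}
```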
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
drivers/media/platform/omap3isp/ispqueue.c:399:18: warning: 'pa' may be
used uninitialized in this function [-Wuninitialized]
This is a false positive but the compiler has no way to know about it,
so initialize the variable to 0.
drivers/media/platform/omap3isp/ispqueue.c:445:6: warning:
'vm_page_prot' may be used uninitialized in this function
[-Wuninitialized]
This is a false positive and the compiler should know better. Use
uninitialized_var().
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
This patch replaces the global frame stats variables by using
internal variables in mcam_camera structure.
Signed-off-by: Albert Wang <twang13@marvell.com>
Signed-off-by: Libin Yang <lbyang@marvell.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
drivers/media/platform/omap3isp/ispcsiphy.c: In function
‘csiphy_routing_cfg’:
drivers/media/platform/omap3isp/ispcsiphy.c:71:57: warning: ‘shift’
may be used uninitialized in this function [-Wuninitialized]
drivers/media/platform/omap3isp/ispcsiphy.c:40:6: note: ‘shift’ was
declared here
The warning is a false positive but the compiler is right in
complaining. Fix it by using the correct enum data type for the iface
argument and adding a default case in the switch statement.
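
A sketch of the pattern, with hypothetical enumerator names and shift
values (they are not taken from the driver): typing the argument as the
enum and adding a default case means no path leaves the variable unset.

```c
/* Hypothetical interface identifiers. */
enum phy_iface { IFACE_A = 0, IFACE_B, IFACE_C };

/* Returns the bit shift for an interface, or -1 for an unknown value. */
static int routing_shift(enum phy_iface iface)
{
    int shift;

    switch (iface) {
    case IFACE_A:
        shift = 0;
        break;
    case IFACE_B:
        shift = 2;
        break;
    case IFACE_C:
        shift = 4;
        break;
    default:
        /* No path falls through with 'shift' uninitialized. */
        return -1;
    }
    return shift;
}
```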
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Sakari Ailus <sakari.ailus@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
spinlock_t should always be used.
Could not get this to build with allmodconfig:
mcgrof@frijol ~/linux-next (git::(no branch))$ make C=1 M=drivers/media/platform/s5p-jpeg
WARNING: Symbol version dump /home/mcgrof/linux-next/Module.symvers
is missing; modules will have no dependencies and modversions.
Building modules, stage 2.
MODPOST 0 modules
Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
spinlock_t should always be used.
Could not get this to build with allmodconfig:
mcgrof@frijol ~/linux-next (git::(no branch))$ make C=1 M=drivers/media/platform/s5p-fimc/
WARNING: Symbol version dump /home/mcgrof/linux-next/Module.symvers
is missing; modules will have no dependencies and modversions.
LD drivers/media/platform/s5p-fimc/built-in.o
Building modules, stage 2.
MODPOST 0 modules
Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Resolves the following warning that appears with allmodconfig on -arm:
warning: (VIDEO_OMAP2_VOUT && DRM_OMAP) selects OMAP2_DSS which has unmet direct dependencies (HAS_IOMEM && ARCH_OMAP2PLUS)
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Linux 3.8-rc1
* tag 'v3.8-rc1': (10696 commits)
Linux 3.8-rc1
Revert "nfsd: warn on odd reply state in nfsd_vfs_read"
ARM: dts: fix duplicated build target and alphabetical sort out for exynos
dm stripe: add WRITE SAME support
dm: remove map_info
dm snapshot: do not use map_context
dm thin: dont use map_context
dm raid1: dont use map_context
dm flakey: dont use map_context
dm raid1: rename read_record to bio_record
dm: move target request nr to dm_target_io
dm snapshot: use per_bio_data
dm verity: use per_bio_data
dm raid1: use per_bio_data
dm: introduce per_bio_data
dm kcopyd: add WRITE SAME support to dm_kcopyd_zero
dm linear: add WRITE SAME support
dm: add WRITE SAME support
dm: prepare to support WRITE SAME
dm ioctl: use kmalloc if possible
...
Conflicts:
MAINTAINERS
|
|
Bf60x soc has a new PPI called Enhanced PPI version 3.
HD video is supported now. To achieve this, we redesign
ppi params and add dv timings feature.
Signed-off-by: Scott Jiang <scott.jiang.linux@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Other drivers can make use of it.
Signed-off-by: Scott Jiang <scott.jiang.linux@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
I was testing my video-over-ethernet subsystem recently, and vivi
seemed to be a perfect video source for testing when one doesn't have
lots of capture boards and cameras. Only its framerate was hardcoded to
NTSC's 30fps, while in my country we usually use PAL (25 fps) and I
needed that to precisely simulate bandwidth.
That's why here is this patch with ->enum_frameintervals() and
->{g,s}_parm() implemented as suggested by Hans Verkuil which passes
v4l2-compliance and manual testing through v4l2-ctl -P / -p <fps>.
Regarding newly introduced __get_format(u32 pixelformat) I decided not
to convert original get_format() to operate on fourcc codes, since >= 3
places in driver need to deal with v4l2_format and otherwise it won't be
handy.
[mchehab@redhat.com: Some CodingStyle fixes]
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
devm_gpio_request is a device managed function and will make
error handling and cleanup a bit simpler.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
precalculate_line() is not very high on profile, but it calls expensive
gen_twopix(), so let's polish it too:
call gen_twopix() only once for every color bar and then distribute
the result.
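
The approach can be sketched as below. gen_twopix_stub() is a
hypothetical stand-in for the expensive conversion, and the bar count
and pixel size are assumptions; the point is one conversion per bar
followed by plain copies.

```c
#include <stdint.h>
#include <string.h>

#define NUM_BARS  8
#define PIXELSIZE 2   /* bytes per pixel; an assumption for this sketch */

/* Stand-in for an expensive per-colour conversion producing two pixels. */
static void gen_twopix_stub(uint8_t out[2 * PIXELSIZE], int color)
{
    for (int i = 0; i < 2 * PIXELSIZE; i++)
        out[i] = (uint8_t)(color * 16 + i);
}

/* Fill one line of `width` pixels; width must be a multiple of
 * 2 * NUM_BARS. Returns the number of conversions performed. */
static int precalc_line(uint8_t *line, int width)
{
    int calls = 0;
    int bar_width = width / NUM_BARS;

    for (int bar = 0; bar < NUM_BARS; bar++) {
        uint8_t pix[2 * PIXELSIZE];
        uint8_t *dst = line + (size_t)bar * bar_width * PIXELSIZE;

        gen_twopix_stub(pix, bar);      /* once per colour bar */
        calls++;
        for (int w = 0; w < bar_width; w += 2)   /* distribute the result */
            memcpy(dst + (size_t)w * PIXELSIZE, pix, sizeof(pix));
    }
    return calls;
}
```

For a 64-pixel line this performs 8 conversions instead of one per pixel
pair.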
before:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
#
# Samples: 46K of event 'cycles'
# Event count (approx.): 15574200568
#
# Overhead Command Shared Object
# ........ ............... ....................
#
27.99% rawv libc-2.13.so [.] __memcpy_ssse3
23.29% vivi-* [kernel.kallsyms] [k] memcpy
10.30% Xorg [unknown] [.] 0xa75c98f8
5.34% vivi-* [vivi] [k] gen_text.constprop.6
4.61% rawv [vivi] [k] gen_twopix
2.64% rawv [vivi] [k] precalculate_line
1.37% swapper [kernel.kallsyms] [k] read_hpet
after:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
#
# Samples: 45K of event 'cycles'
# Event count (approx.): 15561769214
#
# Overhead Command Shared Object
# ........ ............... ....................
#
30.73% rawv libc-2.13.so [.] __memcpy_ssse3
26.78% vivi-* [kernel.kallsyms] [k] memcpy
10.68% Xorg [unknown] [.] 0xa73015e9
5.55% vivi-* [vivi] [k] gen_text.constprop.6
1.36% swapper [kernel.kallsyms] [k] read_hpet
0.96% Xorg [kernel.kallsyms] [k] read_hpet
...
0.16% rawv [vivi] [k] precalculate_line
...
0.14% rawv [vivi] [k] gen_twopix
(i.e. gen_twopix and precalculate_line overheads are almost gone)
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
The "dev->mv_count % wmax" thing was showing high in profiles (we do it
for each line, which is ~500 per frame):
       │    000010c0 <vivi_fillbuff>:
...
  0,39 │ 70:┌─→mov    0x3ff4(%edi),%esi
  0,22 │ 76:│  mov    0x2a0(%edi),%eax
  0,30 │    │  mov    -0x84(%ebp),%ebx
  0,35 │    │  mov    %eax,%edx
  0,04 │    │  mov    -0x7c(%ebp),%ecx
  0,35 │    │  sar    $0x1f,%edx
  0,44 │    │  idivl  -0x7c(%ebp)
 21,68 │    │  imul   %esi,%ecx
  0,70 │    │  imul   %esi,%ebx
  0,52 │    │  add    -0x88(%ebp),%ebx
  1,65 │    │  mov    %ebx,%eax
  0,22 │    │  imul   %edx,%esi
  0,04 │    │  lea    0x3f4(%edi,%esi,1),%edx
  2,18 │    │→ call   vivi_fillbuff+0xa6
  0,74 │    │  addl   $0x1,-0x80(%ebp)
 62,69 │    │  mov    -0x7c(%ebp),%edx
  1,18 │    │  mov    -0x80(%ebp),%ecx
  0,35 │    │  add    %edx,-0x84(%ebp)
  0,61 │    │  cmp    %ecx,-0x8c(%ebp)
  0,22 │    └──jne    70
so since all variables stay the same for all iterations let's move
computations out of the loop: the abovementioned division and
"width*pixelsize" too
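
A minimal sketch of the hoisting, using a hypothetical standalone
function rather than the driver's vivi_fillbuff(): the modulo and the
line size are computed once, and the loop body is only the copy. It
assumes mv_count is non-negative and that the source line holds at least
2 * wmax pixels.

```c
#include <stdint.h>
#include <string.h>

/* Copy the same shifted source line into each of hmax output rows. */
static void fill_lines(uint8_t *vbuf, const uint8_t *line,
                       int wmax, int hmax, int pixelsize, int mv_count)
{
    /* Loop-invariant values, computed once instead of once per row. */
    size_t linesize = (size_t)wmax * (size_t)pixelsize;
    const uint8_t *src = line + (size_t)(mv_count % wmax) * (size_t)pixelsize;

    for (int h = 0; h < hmax; h++)
        memcpy(vbuf + (size_t)h * linesize, src, linesize);
}
```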
before:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
#
# Samples: 49K of event 'cycles'
# Event count (approx.): 16475832370
#
# Overhead Command Shared Object
# ........ ............... ......................
#
29.07% rawv libc-2.13.so [.] __memcpy_ssse3
20.57% vivi-* [kernel.kallsyms] [k] memcpy
10.20% Xorg [unknown] [.] 0xa7301494
5.16% vivi-* [vivi] [k] gen_text.constprop.6
4.43% rawv [vivi] [k] gen_twopix
4.36% vivi-* [vivi] [k] vivi_fillbuff
2.42% rawv [vivi] [k] precalculate_line
1.33% swapper [kernel.kallsyms] [k] read_hpet
after:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
#
# Samples: 46K of event 'cycles'
# Event count (approx.): 15574200568
#
# Overhead Command Shared Object
# ........ ............... ....................
#
27.99% rawv libc-2.13.so [.] __memcpy_ssse3
23.29% vivi-* [kernel.kallsyms] [k] memcpy
10.30% Xorg [unknown] [.] 0xa75c98f8
5.34% vivi-* [vivi] [k] gen_text.constprop.6
4.61% rawv [vivi] [k] gen_twopix
2.64% rawv [vivi] [k] precalculate_line
1.37% swapper [kernel.kallsyms] [k] read_hpet
0.79% Xorg [kernel.kallsyms] [k] read_hpet
0.64% Xorg [kernel.kallsyms] [k] unix_poll
0.45% Xorg [kernel.kallsyms] [k] fget_light
0.43% rawv libxcb.so.1.1.0 [.] 0x0000aae9
0.40% runsv [kernel.kallsyms] [k] ext2_try_to_allocate
0.36% Xorg [kernel.kallsyms] [k] _raw_spin_lock_irqsave
0.31% vivi-* [vivi] [k] vivi_fillbuff
(i.e. vivi_fillbuff own overhead is almost gone)
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Though dev->line[] is u8 array we work with it as with u16, u24 or u32
pixels, and also pass it to memcpy() and it's better to align it to at
least 4.
Before the patch, on x86 offsetof(vivi_dev, line) was 1003 and after
patch it is 1004.
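
The effect on the member offset can be reproduced with a small sketch.
The struct layouts below are hypothetical (three leading bytes chosen so
the array would otherwise start at an odd offset), and C11 _Alignas is
used here for portability of the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Without an alignment request a u8 array may start at any offset. */
struct dev_unaligned {
    uint8_t flags[3];
    uint8_t line[64];
};

/* Requesting 4-byte alignment makes the compiler insert padding. */
struct dev_aligned {
    uint8_t flags[3];
    _Alignas(4) uint8_t line[64];
};

static size_t line_offset_unaligned(void)
{
    return offsetof(struct dev_unaligned, line);
}

static size_t line_offset_aligned(void)
{
    return offsetof(struct dev_aligned, line);
}
```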
There is a slight performance increase, but I think it is slight only
because we start copying not from line[0]:
---- 8< ---- drivers/media/platform/vivi.c
static void vivi_fillbuff(struct vivi_dev *dev, struct vivi_buffer *buf)
{
...
for (h = 0; h < hmax; h++)
memcpy(vbuf + h * wmax * dev->pixelsize,
dev->line + (dev->mv_count % wmax) * dev->pixelsize,
wmax * dev->pixelsize);
before:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
#
# Samples: 49K of event 'cycles'
# Event count (approx.): 16799780016
#
# Overhead Command Shared Object
# ........ ............... ....................
#
27.51% rawv libc-2.13.so [.] __memcpy_ssse3
23.77% vivi-* [kernel.kallsyms] [k] memcpy
9.96% Xorg [unknown] [.] 0xa76f5e12
4.94% vivi-* [vivi] [k] gen_text.constprop.6
4.44% rawv [vivi] [k] gen_twopix
3.17% vivi-* [vivi] [k] vivi_fillbuff
2.45% rawv [vivi] [k] precalculate_line
1.20% swapper [kernel.kallsyms] [k] read_hpet
23.77% vivi-* [kernel.kallsyms] [k] memcpy
|
--- memcpy
|
|--99.28%-- vivi_fillbuff
| vivi_thread
| kthread
| ret_from_kernel_thread
--0.72%-- [...]
after:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
#
# Samples: 49K of event 'cycles'
# Event count (approx.): 16475832370
#
# Overhead Command Shared Object
# ........ ............... ......................
#
29.07% rawv libc-2.13.so [.] __memcpy_ssse3
20.57% vivi-* [kernel.kallsyms] [k] memcpy
10.20% Xorg [unknown] [.] 0xa7301494
5.16% vivi-* [vivi] [k] gen_text.constprop.6
4.43% rawv [vivi] [k] gen_twopix
4.36% vivi-* [vivi] [k] vivi_fillbuff
2.42% rawv [vivi] [k] precalculate_line
1.33% swapper [kernel.kallsyms] [k] read_hpet
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
I've noticed that vivi takes a lot of CPU to produce its frames.
For example for 8 devices and 8 simple programs running, where each
captures YUY2 640x480 and displays it to X via SDL, profile timing is as
follows:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
# Samples: 82K of event 'cycles'
# Event count (approx.): 31551930117
#
# Overhead Command Shared Object Symbol
# ........ ............... ....................
#
49.48% vivi-* [vivi] [k] gen_twopix
10.79% vivi-* [kernel.kallsyms] [k] memcpy
10.02% rawv libc-2.13.so [.] __memcpy_ssse3
8.35% vivi-* [vivi] [k] gen_text.constprop.6
5.06% Xorg [unknown] [.] 0xa73015f8
2.32% rawv [vivi] [k] gen_twopix
1.22% rawv [vivi] [k] precalculate_line
1.20% vivi-* [vivi] [k] vivi_fillbuff
(rawv is display program, vivi-* is a combination of vivi-000 through vivi-007)
so a lot of time is spent in gen_twopix() which, as the following
call-graph profile shows ...
49.48% vivi-* [vivi] [k] gen_twopix
|
--- gen_twopix
|
|--96.30%-- gen_text.constprop.6
| vivi_fillbuff
| vivi_thread
| kthread
| ret_from_kernel_thread
|
--3.70%-- vivi_fillbuff
vivi_thread
kthread
ret_from_kernel_thread
... is called mostly from gen_text().
If we look at gen_text(), in the inner loop we see
if (chr & (1 << (7 - i)))
gen_twopix(dev, pos + j * dev->pixelsize, WHITE, (x+y) & 1);
else
gen_twopix(dev, pos + j * dev->pixelsize, TEXT_BLACK, (x+y) & 1);
which calls gen_twopix() for every character pixel, and that is very
expensive, because gen_twopix() branches several times.
Now, note that we operate on only two colors, WHITE and TEXT_BLACK, and
that the pixels for those colors could be precomputed and gen_twopix()
moved out of the inner loop. Also note that for black and white colors
even/odd does not make a difference for any supported pixel format, so
we can stop playing the `odd` gen_twopix() parameter game.
So the first thing we are doing here is
1) moving gen_twopix() calls out of gen_text() into vivi_fillbuff(),
to pregenerate black and white colors, just before printing
starts.
what we have next is that gen_text's font rendering loop, even with
gen_twopix() calls moved out, was inefficient and branchy, so let's
2) rewrite the gen_text() loop so it uses fewer variables + unroll the
char horizontal-rendering loop + instantiate 3 code paths for pixelsizes
2, 3 and 4 so that in all inner loops we don't have to branch or make
indirections (*).
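
The two steps can be sketched for the 2-byte pixel case as follows.
render_row16() is a hypothetical name; fg and bg stand for the white and
black pixels generated once before text rendering starts, and each
select can compile to a conditional move rather than a branch.

```c
#include <stdint.h>

/* Render one 8-pixel glyph row for 16-bit pixels.
 * fg and bg are precomputed once per frame, not once per pixel. */
static void render_row16(uint16_t *pos, uint8_t chr, uint16_t fg, uint16_t bg)
{
    /* Unrolled: one select and one store per font bit, MSB first. */
    pos[0] = (chr & 0x80) ? fg : bg;
    pos[1] = (chr & 0x40) ? fg : bg;
    pos[2] = (chr & 0x20) ? fg : bg;
    pos[3] = (chr & 0x10) ? fg : bg;
    pos[4] = (chr & 0x08) ? fg : bg;
    pos[5] = (chr & 0x04) ? fg : bg;
    pos[6] = (chr & 0x02) ? fg : bg;
    pos[7] = (chr & 0x01) ? fg : bg;
}
```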
With all the above reworks done, for gen_text() we get nice, non-branchy
streamlined code (showing the loop for pixelsize=2):
       │       cmp    $0x2,%eax
       │     ↓ jne    26
       │       mov    -0x18(%ebp),%eax
       │       mov    -0x20(%ebp),%edi
       │       imul   -0x20(%ebp),%eax
       │       movzwl 0x3ffc(%ebx),%esi
  0,08 │       movzwl 0x4000(%ebx),%ecx
  0,04 │       add    %edi,%edi
       │       mov    0x0,%ebx
  0,51 │       mov    %edi,-0x1c(%ebp)
       │       mov    %ebx,-0x14(%ebp)
       │       movl   $0x0,-0x10(%ebp)
       │       lea    0x20(%edx,%eax,2),%eax
       │       mov    %eax,-0x18(%ebp)
       │       xchg   %ax,%ax
  0,04 │ a0:   mov    0x8(%ebp),%ebx
       │       mov    -0x18(%ebp),%eax
  0,04 │       movzbl (%ebx),%edx
  0,16 │       test   %dl,%dl
  0,04 │     ↓ je     128
  0,08 │       lea    0x0(%esi),%esi
  1,61 │ b0:┌─→shl    $0x4,%edx
  1,02 │    │  mov    -0x14(%ebp),%edi
  2,04 │    │  add    -0x10(%ebp),%edx
  2,24 │    │  lea    0x1(%ebx),%ebx
  0,27 │    │  movzbl (%edi,%edx,1),%edx
  9,92 │    │  mov    %esi,%edi
  0,39 │    │  test   %dl,%dl
  2,04 │    │  cmovns %ecx,%edi
  4,63 │    │  test   $0x40,%dl
  0,55 │    │  mov    %di,(%eax)
  3,76 │    │  mov    %esi,%edi
  0,71 │    │  cmove  %ecx,%edi
  3,41 │    │  test   $0x20,%dl
  0,75 │    │  mov    %di,0x2(%eax)
  2,43 │    │  mov    %esi,%edi
  0,59 │    │  cmove  %ecx,%edi
  4,59 │    │  test   $0x10,%dl
  0,67 │    │  mov    %di,0x4(%eax)
  2,55 │    │  mov    %esi,%edi
  0,78 │    │  cmove  %ecx,%edi
  4,31 │    │  test   $0x8,%dl
  0,67 │    │  mov    %di,0x6(%eax)
  5,76 │    │  mov    %esi,%edi
  1,80 │    │  cmove  %ecx,%edi
  4,20 │    │  test   $0x4,%dl
  0,86 │    │  mov    %di,0x8(%eax)
  2,98 │    │  mov    %esi,%edi
  1,37 │    │  cmove  %ecx,%edi
  4,67 │    │  test   $0x2,%dl
  0,20 │    │  mov    %di,0xa(%eax)
  2,78 │    │  mov    %esi,%edi
  0,75 │    │  cmove  %ecx,%edi
  3,92 │    │  and    $0x1,%edx
  0,75 │    │  mov    %esi,%edx
  2,59 │    │  mov    %di,0xc(%eax)
  0,59 │    │  cmove  %ecx,%edx
  3,10 │    │  mov    %dx,0xe(%eax)
  2,39 │    │  add    $0x10,%eax
  0,51 │    │  movzbl (%ebx),%edx
  2,86 │    │  test   %dl,%dl
  2,31 │    └──jne    b0
  0,04 │128:   addl   $0x1,-0x10(%ebp)
  4,00 │       mov    -0x1c(%ebp),%eax
  0,04 │       add    %eax,-0x18(%ebp)
  0,08 │       cmpl   $0x10,-0x10(%ebp)
       │     ↑ jne    a0
which almost goes away from the profile:
# cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
# Samples: 49K of event 'cycles'
# Event count (approx.): 16799780016
#
# Overhead Command Shared Object Symbol
# ........ ............... ....................
#
27.51% rawv libc-2.13.so [.] __memcpy_ssse3
23.77% vivi-* [kernel.kallsyms] [k] memcpy
9.96% Xorg [unknown] [.] 0xa76f5e12
4.94% vivi-* [vivi] [k] gen_text.constprop.6
4.44% rawv [vivi] [k] gen_twopix
3.17% vivi-* [vivi] [k] vivi_fillbuff
2.45% rawv [vivi] [k] precalculate_line
1.20% swapper [kernel.kallsyms] [k] read_hpet
i.e. gen_twopix() overhead dropped from 49% to 4% and gen_text() loops
from ~8% to ~4%, and the overall cycle count dropped from 31551930117 to
16799780016, which is a ~1.9x speedup of the whole workload.
(*) for RGB24 rendering I've introduced x24, which can be thought of as
a synthetic u24 that simplifies the code. That's done because when
memcpy is used for conditional assignment, gcc generates suboptimal code
with more indirections.
Fortunately, in C struct assignment is builtin and that's all we
need from pixeltype for font rendering.
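
A sketch of the struct-assignment idea. The x24 layout shown here (a
three-byte array wrapped in a struct) is an assumption about how such a
type could look, and render_row24() is a hypothetical name.

```c
#include <stdint.h>

/* Synthetic 24-bit pixel: wrapping the bytes in a struct lets plain
 * assignment copy all three at once. */
typedef struct { uint8_t b[3]; } x24;

/* Render one 8-pixel glyph row of 24-bit pixels, MSB first. */
static void render_row24(x24 *pos, uint8_t chr, x24 fg, x24 bg)
{
    for (int i = 0; i < 8; i++)
        pos[i] = (chr & (0x80 >> i)) ? fg : bg;   /* struct assignment */
}
```

The conditional expression selects a whole pixel value, so no memcpy()
call or pointer to the chosen colour is needed in the loop.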
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|