Opened 11 years ago
Last modified 9 years ago
#227 new defect
UFO hardware is ignoring lowest 12 bits of bus address
Reported by: | Suren A. Chilingaryan | Owned by: | Michele Caselle
---|---|---|---
Priority: | major | Milestone: |
Component: | UFO Camera | Version: |
Keywords: | | Cc: | Matthias Balzer, Suren A. Chilingaryan, Timo Dritschler, Andreas Kopmann, Michele Caselle, Uros Stevanovic
Description
I have localized the situations in which the data corruption is observed with the HEB hardware. We have a ring of DMA buffers. Each DMA buffer has a virtual memory address (for software) and a bus address (for hardware). When the bus address is a multiple of 0x1000, everything works fine, for instance 0x7c573000. If the bus address of a certain buffer is not a multiple of 0x1000, the problems start, for instance 0x7c510800.
So, I guess that the HEB hardware is ignoring the lowest 12 bits of the bus address. I.e. if the hardware needs to send data to a 4K buffer with bus addresses from 0x7c510800 to 0x7c511800, it will actually write the data at addresses 0x7c510000 to 0x7c511000. So, the first part of the data will be written to an undefined location in system memory and probably lost (or may even corrupt earlier written data), and the last part of the data (0x800 bytes, from 0x7c510800 to 0x7c511000) will be written to the correct buffer, but at the beginning instead of the end. So, for this reason you get the spaghetti data in the buffers.
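A minimal user-space sketch of this arithmetic (the addresses are the ones from this report; the 12-bit mask models the suspected, unconfirmed hardware behaviour):

```c
#include <stdio.h>

int main(void)
{
    unsigned long long bus    = 0x7c510800ULL;   /* unaligned bus address of the buffer */
    unsigned long long size   = 0x1000ULL;       /* 4K DMA transfer */
    unsigned long long actual = bus & ~0xfffULL; /* suspected behaviour: low 12 bits dropped */
    unsigned long long offset = bus & 0xfffULL;  /* 0x800 in this example */

    printf("intended write: 0x%llx - 0x%llx\n", bus, bus + size);
    printf("actual write:   0x%llx - 0x%llx\n", actual, actual + size);
    printf("0x%llx bytes land before the buffer (lost or corrupting other data)\n", offset);
    printf("0x%llx bytes land at the buffer start instead of its end\n", size - offset);
    return 0;
}
```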
Below is a bit of theory about how the mapping is done and why we currently do not observe the same problem with the UFO camera.
On 64-bit systems, a mechanism that converts 64-bit hardware addresses into 32-bit PCI bus addresses is needed to enable DMA transfers to devices that only support 32-bit addressing. Such a mechanism is called an IOMMU and is provided by the Linux kernel to drivers transparently.
By default, the Linux kernel provides SWIOTLB, a software implementation of an IOMMU based on bounce buffers. A special buffer is allocated in the lower 4 GB of memory and used for all DMA transfers; the data is then transparently copied to the actually supplied buffer in high memory. The usage of SWIOTLB may be detected with:
dmesg | grep -A 2 SWIOTLB
[ 1.169614] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 1.169618] Placing 64MB software IO TLB between ffff8800bb766000 - ffff8800bf766000
[ 1.169620] software IO TLB at phys 0xbb766000 - 0xbf766000
Now, the limitations are the limited size of the buffer (64 MB by default, but it may be increased with the swiotlb=128M kernel parameter) and the performance penalty due to the copying of memory.
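For illustration, a minimal sketch of how one buffer of the ring would typically be allocated and mapped with the kernel 3.x PCI DMA API (hypothetical driver code; the function name is mine). With a 32-bit DMA mask and a page above 4 GB, pci_map_single transparently redirects the transfer through the SWIOTLB bounce buffer, and the returned bus address points into that buffer:

```c
#include <linux/pci.h>
#include <linux/gfp.h>

/* Allocate one 4K page and map it for device-to-host DMA.
 * The page's physical address is 4K-aligned by construction, but the
 * bus address returned by pci_map_single may point into the SWIOTLB
 * bounce buffer and carry a different (sub-page) alignment. */
static dma_addr_t map_ring_buffer(struct pci_dev *pdev, unsigned long *vaddr)
{
    dma_addr_t bus;

    *vaddr = __get_free_pages(GFP_KERNEL, 0);
    if (!*vaddr)
        return 0;

    bus = pci_map_single(pdev, (void *)*vaddr, PAGE_SIZE, PCI_DMA_FROMDEVICE);
    if (pci_dma_mapping_error(pdev, bus)) {
        free_pages(*vaddr, 0);
        return 0;
    }
    return bus;
}
```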
The alternative is the hardware IOMMU available in systems with Intel VT-d and AMD-Vi (AMD IOMMU) virtualization technologies. VT-d should be enabled in the BIOS and turned on with the "intel_iommu=on" kernel parameter (the alternative is to build the kernel with CONFIG_INTEL_IOMMU_DEFAULT_ON). One may check if the hardware IOMMU is enabled with (check the PCI-DMA line):
dmesg | grep -e IOMMU -e DMAR -e PCI-DMA
[ 0.000000] Intel-IOMMU: enabled
[ 0.124951] dmar: IOMMU 0: reg_base_addr fbffc000 ver 1:0 cap d2078c106f0466 ecap f020df
[ 0.125044] IOAPIC id 0 under DRHD base 0xfbffc000 IOMMU 0
[ 0.125045] IOAPIC id 2 under DRHD base 0xfbffc000 IOMMU 0
[ 0.836366] IOMMU 0 0xfbffc000: using Queued invalidation
[ 0.836370] IOMMU: Setting RMRR:
[ 0.836377] IOMMU: Setting identity map for device 0000:00:1d.0 [0x7ccd2000 - 0x7ccf6fff]
[ 0.836387] IOMMU: Setting identity map for device 0000:00:1a.0 [0x7ccd2000 - 0x7ccf6fff]
[ 0.836390] IOMMU: Prepare 0-16MiB unity mapping for LPC
[ 0.836394] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 0.836400] PCI-DMA: Intel(R) Virtualization Technology for Directed I/O
According to the documentation, this still has a minor performance penalty, although an IOTLB cache is implemented as well. However, this method is preferable to the software solution.
Now, the problems arise due to the way the IOMMU maps hardware addresses to bus addresses. While the hardware addresses of pages allocated with __get_free_pages are guaranteed to be aligned to 4K boundaries, the same is not true for the corresponding IOMMU bus addresses. For instance, I get 0xffff880618807000 from __get_free_pages and pci_map_single assigns it 0x7c510800 as a bus address. This fits the PCIe specification, but may be unexpected by some in-house hardware. The situation with our hardware stations is the following (see the sketch after the list):
1) SWIOTLB has evolved with the Linux kernel. Kernel 3.2 (SuSE 12.1) always returned bus addresses aligned to 4K boundaries. Then, I guess, something was optimized to use the bounce buffer more efficiently, and with kernel 3.4 (SuSE 12.2) pages are very rarely mapped to unaligned addresses. More optimizations were done, and with kernel 3.11 (SuSE 13.1) the pages are quite often mapped to unaligned addresses.
2) The hardware IOMMU of a Xeon E5-1620 running on an Asus Z9PA-U8 motherboard always (at least with the currently installed hardware) maps pages to addresses aligned to 4K boundaries. However, I don't know how IOMMU implementations behave on AMD (or even on other Intel platforms).
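Since the behaviour is kernel- and platform-dependent, a defensive check in the driver seems worthwhile. A hedged sketch (hypothetical helper, kernel 3.x API) that rejects unaligned mappings before the hardware can scramble the data:

```c
#include <linux/pci.h>

/* Reject DMA mappings whose bus addresses are not 4K-aligned, so the
 * suspected 12-bit truncation in the hardware cannot corrupt data. */
static int check_ring_alignment(struct pci_dev *pdev,
                                dma_addr_t *ring, int nbufs)
{
    int i;

    for (i = 0; i < nbufs; i++) {
        if (ring[i] & (PAGE_SIZE - 1)) {
            dev_err(&pdev->dev,
                    "DMA buffer %d: bus address 0x%llx is not 4K-aligned\n",
                    i, (unsigned long long)ring[i]);
            return -EINVAL;
        }
    }
    return 0;
}
```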
The original problem is resolved by the introduction of 64-bit addressing: we don't need the mapping any more and, hence, bounce buffers will never be used. However, being able to write to non-page-aligned locations would still be a nice feature, so I keep the ticket open.
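For reference, a sketch of how 64-bit addressing is typically enabled during driver probe (hypothetical fragment; once the 64-bit DMA mask is accepted, the buffers are used in place and SWIOTLB bounce buffering never occurs for this device):

```c
#include <linux/pci.h>
#include <linux/dma-mapping.h>

/* Prefer 64-bit DMA addressing; fall back to 32-bit (which may bounce). */
static int enable_dma64(struct pci_dev *pdev)
{
    if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64)))
        return 0;

    dev_warn(&pdev->dev, "64-bit DMA not available, falling back to 32-bit\n");
    return pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
}
```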