  1. SPI Master driver
  2. =================
  3. Overview
  4. --------
  5. The ESP32 has four SPI peripheral devices, called SPI0, SPI1, HSPI and VSPI. SPI0 is entirely dedicated to
  6. the flash cache the ESP32 uses to map the SPI flash device it is connected to into memory. SPI1 is
  7. connected to the same hardware lines as SPI0 and is used to write to the flash chip. HSPI and VSPI
  8. are free to use. SPI1, HSPI and VSPI all have three chip select lines, allowing them to drive up to
  9. three SPI devices each as a master.
  10. The spi_master driver
  11. ^^^^^^^^^^^^^^^^^^^^^
The spi_master driver makes it easy to communicate with SPI slave devices, even in a multithreaded environment.
It transparently handles DMA transfers to read and write data, and automatically takes care of
multiplexing between different SPI slaves on the same master.
  15. Terminology
  16. ^^^^^^^^^^^
  17. The spi_master driver uses the following terms:
* Host: The SPI peripheral inside the ESP32 initiating the SPI transmissions. One of SPI, HSPI or VSPI. (For
  now, only HSPI and VSPI are actually supported in the driver; it will support all 3 peripherals
  in the future.)
  21. * Bus: The SPI bus, common to all SPI devices connected to one host. In general the bus consists of the
  22. miso, mosi, sclk and optionally quadwp and quadhd signals. The SPI slaves are connected to these
  23. signals in parallel.
  24. - miso - Also known as q, this is the input of the serial stream into the ESP32
  25. - mosi - Also known as d, this is the output of the serial stream from the ESP32
  26. - sclk - Clock signal. Each data bit is clocked out or in on the positive or negative edge of this signal
  27. - quadwp - Write Protect signal. Only used for 4-bit (qio/qout) transactions.
  28. - quadhd - Hold signal. Only used for 4-bit (qio/qout) transactions.
  29. * Device: A SPI slave. Each SPI slave has its own chip select (CS) line, which is made active when
  30. a transmission to/from the SPI slave occurs.
  31. * Transaction: One instance of CS going active, data transfer from and/or to a device happening, and
  32. CS going inactive again. Transactions are atomic, as in they will never be interrupted by another
  33. transaction.
  34. SPI transactions
  35. ^^^^^^^^^^^^^^^^
  36. A transaction on the SPI bus consists of five phases, any of which may be skipped:
  37. * The command phase. In this phase, a command (0-16 bit) is clocked out.
  38. * The address phase. In this phase, an address (0-64 bit) is clocked out.
  39. * The write phase. The master sends data to the slave.
* The dummy phase. This phase is configurable and is used to meet timing requirements.
  41. * The read phase. The slave sends data to the master.
In full duplex mode, the read and write phases are combined, and the SPI host reads and
writes data simultaneously. The total transaction length is determined by
``command_bits + address_bits + trans_conf.length``, while ``trans_conf.rx_length``
only determines the length of the data received into the buffer.
In half duplex mode, the host has independent write and read phases. The lengths of the write phase and the read phase are
determined by ``trans_conf.length`` and ``trans_conf.rx_length`` respectively.
  48. The command and address phase are optional in that not every SPI device will need to be sent a command
  49. and/or address. This is reflected in the device configuration: when the ``command_bits`` or ``address_bits``
  50. fields are set to zero, no command or address phase is done.
Something similar is true for the read and write phases: not every transaction needs both data to be written
and data to be read. When ``rx_buffer`` is NULL (and ``SPI_USE_RXDATA`` is not set), the read phase
is skipped. When ``tx_buffer`` is NULL (and ``SPI_USE_TXDATA`` is not set), the write phase is skipped.
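
The length rules above can be sketched as a small host-side calculation. This is only an illustration, not driver code: the ``phase_bits_t`` struct and ``transaction_clocks`` function are hypothetical names, and the half duplex sum assumes the phases simply follow one another in the order listed above.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical container for the phase lengths, all in bits. */
    typedef struct {
        uint32_t command_bits;
        uint32_t address_bits;
        uint32_t dummy_bits;
        uint32_t length;     /* write length; total data length in full duplex */
        uint32_t rx_length;  /* read length; only used for the sum in half duplex */
    } phase_bits_t;

    /* Number of SCLK cycles one transaction occupies (illustrative only). */
    uint32_t transaction_clocks(const phase_bits_t *p, bool half_duplex)
    {
        uint32_t clocks = p->command_bits + p->address_bits;
        if (half_duplex) {
            /* Assumption: write, dummy and read phases run back to back. */
            clocks += p->length + p->dummy_bits + p->rx_length;
        } else {
            /* Full duplex: read and write share the same clocks. */
            clocks += p->length;
        }
        return clocks;
    }

For example, an 8-bit command, a 24-bit address and 32 bits of data occupy 64 clocks in full duplex, regardless of ``rx_length``.
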
  54. GPIO matrix and IOMUX
  55. ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  56. Most peripheral signals in ESP32 can connect directly to a specific GPIO, which is called its IOMUX pin. When a
  57. peripheral signal is routed to a pin other than its IOMUX pin, ESP32 uses the less direct GPIO matrix to make this
  58. connection.
If the driver is configured with all SPI signals set to their specific IOMUX pins (or left unconnected), it will bypass
the GPIO matrix. If any SPI signal is configured to a pin other than its IOMUX pin, the driver will automatically route
all the signals via the GPIO matrix. The GPIO matrix samples all signals at 80MHz and sends them between the GPIO and
the peripheral.
  63. When the GPIO matrix is used, signals faster than 40MHz cannot propagate and the setup time of MISO is more easily
  64. violated, since the input delay of MISO signal is increased. The maximum clock frequency with GPIO Matrix is 40MHz
  65. or less, whereas using all IOMUX pins allows 80MHz.
.. note:: For more details about the influence of input delay on the maximum clock frequency, see :ref:`timing_considerations` below.
  67. IOMUX pins for SPI controllers are as below:

+----------+------+------+
| Pin Name | HSPI | VSPI |
+          +------+------+
|          | GPIO Number |
+==========+======+======+
| CS0*     | 15   | 5    |
+----------+------+------+
| SCLK     | 14   | 18   |
+----------+------+------+
| MISO     | 12   | 19   |
+----------+------+------+
| MOSI     | 13   | 23   |
+----------+------+------+
| QUADWP   | 2    | 22   |
+----------+------+------+
| QUADHD   | 4    | 21   |
+----------+------+------+

\* Only the first device attached to the bus can use the CS0 pin.
  86. Using the spi_master driver
  87. ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  88. - Initialize a SPI bus by calling ``spi_bus_initialize``. Make sure to set the correct IO pins in
  89. the ``bus_config`` struct. Take care to set signals that are not needed to -1.
- Tell the driver about a SPI slave device connected to the bus by calling ``spi_bus_add_device``.
  Make sure to configure any timing requirements the device has in the ``dev_config`` structure.
  You should now have a handle for the device, to be used when sending it a transaction.
- To interact with the device, fill one or more ``spi_transaction_t`` structures with any transaction
  parameters you need. Either queue all transactions by calling ``spi_device_queue_trans``, later
  querying the result using ``spi_device_get_trans_result``, or handle all requests synchronously
  by feeding them into ``spi_device_transmit``.
- Optional: to unload the driver for a device, call ``spi_bus_remove_device`` with the device
  handle as an argument.
- Optional: to remove the driver for a bus, make sure no more devices are attached and call
  ``spi_bus_free``.
  101. Command and address phases
  102. ^^^^^^^^^^^^^^^^^^^^^^^^^^
During the command and address phases, the ``cmd`` and ``addr`` fields in the
``spi_transaction_t`` struct are sent to the bus, while nothing is read at the
same time. The default lengths of the command and address phases are set in the
``spi_device_interface_config_t`` by calling ``spi_bus_add_device``. When the
flags ``SPI_TRANS_VARIABLE_CMD`` and ``SPI_TRANS_VARIABLE_ADDR`` are not set in
the ``spi_transaction_t``, the driver automatically sets the lengths of these
phases to the default values configured when the device was initialized.
If the length of the command or address phase needs to be variable, declare a
``spi_transaction_ext_t`` descriptor, set the flag ``SPI_TRANS_VARIABLE_CMD``
and/or ``SPI_TRANS_VARIABLE_ADDR`` in the ``flags`` of its ``base`` member, and
configure the rest of ``base`` as usual. The length of each phase
will then be the ``command_bits`` and ``address_bits`` set in the ``spi_transaction_ext_t``.
  115. Write and read phases
  116. ^^^^^^^^^^^^^^^^^^^^^
  117. Normally, data to be transferred to or from a device will be read from or written to a chunk of memory
  118. indicated by the ``rx_buffer`` and ``tx_buffer`` members of the transaction structure.
When DMA is enabled for transfers, it is highly recommended that these buffers meet the following requirements:
1. allocated in DMA-capable memory using ``pvPortMallocCaps(size, MALLOC_CAP_DMA)``;
2. 32-bit aligned (start on a 32-bit boundary and have a length that is a multiple of 4 bytes).
If these requirements are not satisfied, the efficiency of the transaction will suffer because temporary buffers
have to be allocated and the data copied with memcpy.
  124. .. note:: Half duplex transactions with both read and write phases are not supported when using DMA. See
  125. :ref:`spi_known_issues` for details and workarounds.
  126. Tips
  127. """"
1. Transactions with a small amount of data:
   Sometimes the amount of data is so small that allocating a separate buffer
   for it is less than optimal. If the data to be transferred is 32 bits or less, it can be stored in the transaction struct
   itself. For transmitted data, use the ``tx_data`` member and set the ``SPI_USE_TXDATA`` flag
   on the transmission. For received data, use ``rx_data`` and set ``SPI_USE_RXDATA``. In both cases, do
   not touch the ``tx_buffer`` or ``rx_buffer`` members, because they use the same memory locations
   as ``tx_data`` and ``rx_data``.
2. Transactions with integers other than uint8_t:
   The SPI peripheral reads and writes memory byte by byte. By default,
   the SPI works in MSB-first mode: each byte is sent or received from the
   MSB to the LSB. However, if you send data whose length is
   not a multiple of 8 bits, the unused (upper) bits of the byte are sent first.
   E.g. if you write ``uint8_t data = 0x15`` (00010101B) and set the length to
   only 5 bits, the sent data is ``00010B`` rather than the expected ``10101B``.
   Moreover, ESP32 is a little-endian chip, so the lowest byte of a uint16_t or uint32_t
   variable is stored at the lowest address. Hence if
   a uint16_t is stored in memory, its bit 7 is sent first, then bits 6
   to 0, followed by bits 15 to 8.
   To send data other than uint8_t arrays, the macro ``SPI_SWAP_DATA_TX`` is
   provided to shift your data to the MSB and swap the MSB to the lowest
   address, while ``SPI_SWAP_DATA_RX`` can be used to swap received data
   from the MSB to its correct place.
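
The shift-and-swap described above can be illustrated on a host machine. The helpers below are hypothetical stand-ins written for this explanation (``byte_swap32``, ``swap_data_tx`` and ``swap_data_rx`` are not driver names); in application code, use the ``SPI_SWAP_DATA_TX`` and ``SPI_SWAP_DATA_RX`` macros provided by the driver. The sketch assumes a 32-bit container and a length of 1 to 32 bits.

.. code-block:: c

    #include <stdint.h>

    /* Portable 32-bit byte swap (reverses the order of the four bytes). */
    static uint32_t byte_swap32(uint32_t v)
    {
        return ((v >> 24) & 0x000000FFu) | ((v >> 8) & 0x0000FF00u) |
               ((v << 8) & 0x00FF0000u) | ((v << 24) & 0xFF000000u);
    }

    /* Hypothetical stand-in: move the low `len` bits of `data` up to the MSB,
     * then swap bytes so that the MSB lands at the lowest address on a
     * little-endian CPU. Returns 0 for an out-of-range length. */
    uint32_t swap_data_tx(uint32_t data, unsigned len)
    {
        if (len == 0 || len > 32) {
            return 0;
        }
        return byte_swap32(data << (32 - len));
    }

    /* Hypothetical stand-in for the reverse direction: undo the byte swap,
     * then shift the received bits back down to the LSB. */
    uint32_t swap_data_rx(uint32_t data, unsigned len)
    {
        if (len == 0 || len > 32) {
            return 0;
        }
        return byte_swap32(data) >> (32 - len);
    }

With these helpers, ``swap_data_tx(0x15, 5)`` yields ``0xA8``, so the byte at the lowest address is ``10101000B`` and the first five bits on the wire are ``10101B``. Likewise, ``swap_data_tx(0x1234, 16)`` yields ``0x3412``, which is stored in memory as ``0x12`` followed by ``0x34``.
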
  150. Speed and Timing Considerations
  151. -------------------------------
  152. Transferring speed
  153. ^^^^^^^^^^^^^^^^^^
Two factors limit the transfer speed: (1) the transaction interval, and (2) the SPI clock frequency used.
With large transactions, the clock frequency determines the transfer speed, whereas the interval affects the
speed significantly when small transactions are used.

1. Transaction interval: The interval mainly comes from the cost of FreeRTOS queues and the time spent switching between
   tasks and the ISR. It also takes time for the software to set up the SPI peripheral registers and to copy data to
   the FIFOs or set up DMA links. Depending on whether DMA is used, the interval of a one-byte transaction is typically
   around 25us.

   1. The CPU is blocked and switched to other tasks while the
      transaction is in flight. This saves CPU time but increases the interval.
   2. When DMA is enabled, it takes about 2us per transaction to set up the linked list. While the master is
      transferring, it automatically reads data from the linked list. If DMA is not enabled, the
      CPU has to write/read each byte to/from the FIFO by itself. Usually this is faster than 2us, but the
      transaction length is limited to 32 bytes for both write and read.

   The typical transaction interval with one byte of data is as below:

   +--------+------------------+
   |        | Transaction Time |
   +========+==================+
   |        | Typical (us)     |
   +--------+------------------+
   | DMA    | 24               |
   +--------+------------------+
   | No DMA | 22               |
   +--------+------------------+

2. SPI clock frequency: Each byte transferred takes 8 clock periods, *8/fspi*. If the clock frequency is
   too high, some functions may not be usable. See :ref:`timing_considerations`.

For a normal transaction, the overall cost is *20+8n/Fspi[MHz]* [us] for n bytes transferred
in one transaction. Hence the transfer speed is *n/(20+8n/Fspi)*. Examples of the transfer speed with an 8MHz
clock:

+-----------+----------------------+--------------------+------------+-------------+
| Frequency | Transaction Interval | Transaction Length | Total Time | Total Speed |
|           |                      |                    |            |             |
| (MHz)     | (us)                 | (bytes)            | (us)       | (kBps)      |
+===========+======================+====================+============+=============+
| 8         | 25                   | 1                  | 26         | 38.5        |
+-----------+----------------------+--------------------+------------+-------------+
| 8         | 25                   | 8                  | 33         | 242.4       |
+-----------+----------------------+--------------------+------------+-------------+
| 8         | 25                   | 16                 | 41         | 390.2       |
+-----------+----------------------+--------------------+------------+-------------+
| 8         | 25                   | 64                 | 89         | 719.1       |
+-----------+----------------------+--------------------+------------+-------------+
| 8         | 25                   | 128                | 153        | 836.6       |
+-----------+----------------------+--------------------+------------+-------------+

  197. When the length of transaction is short, the cost of transaction interval is really high. Please try to squash data
  198. into one transaction if possible to get higher transfer speed.
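
The cost model above is easy to evaluate for other clock rates and transaction sizes. The following is a minimal sketch; the function names are hypothetical, and the interval is passed in as a parameter because the typical values quoted in this section range from about 20us to 25us.

.. code-block:: c

    /* Total time in microseconds for one transaction of n_bytes:
     * a fixed per-transaction interval plus 8 clock periods per byte. */
    double transaction_time_us(double interval_us, double freq_mhz, unsigned n_bytes)
    {
        return interval_us + 8.0 * n_bytes / freq_mhz;
    }

    /* Effective throughput in kilobytes per second.
     * Bytes per microsecond equals MB/s, so multiply by 1000 for kB/s. */
    double transfer_speed_kBps(double interval_us, double freq_mhz, unsigned n_bytes)
    {
        return 1000.0 * n_bytes / transaction_time_us(interval_us, freq_mhz, n_bytes);
    }

With a 25us interval at 8MHz, a 64-byte transaction takes 89us, which is about 719 kBps, while a single byte takes 26us and reaches only about 38.5 kBps.
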
  199. .. _timing_considerations:
  200. Timing considerations
  201. ^^^^^^^^^^^^^^^^^^^^^
As shown in the figure below, there is a delay on the MISO signal after the SCLK
launch edge and before it is latched by the internal register. As a result,
the MISO pin setup time is the limiting factor for the SPI clock speed. When the
delay is too large, the setup slack is < 0 and the setup timing requirement is
violated, which causes data to be read incorrectly.
  207. .. image:: /../_static/spi_miso.png
  208. .. wavedrom don't support rendering pdflatex till now(1.3.1), so we use the png here
  209. .. image:: /../_static/miso_timing_waveform.png
  210. The maximum frequency allowed is related to the *input delay* (maximum valid
  211. time after SCLK on the MISO bus), as well as the usage of GPIO matrix. The
  212. maximum frequency allowed is reduced to about 33~77% (related to existing
  213. *input delay*) when the GPIO matrix is used. To work at higher frequency, you
  214. have to use the IOMUX pins or the *dummy bit workaround*. You can get the
  215. maximum reading frequency of the master by ``spi_get_freq_limit``.
  216. .. _dummy_bit_workaround:
**Dummy bit workaround:** We can insert dummy clocks (during which the host does not read data) before the read phase
actually begins. The slave still sees the dummy clocks and gives out data, but the host does not read until the read
phase. This compensates for the lack of MISO setup time required by the host, allowing the host to read at a higher
frequency.
  221. In the ideal case (the slave is so fast that the input delay is shorter than an apb clock, 12.5ns), the maximum
  222. frequency host can read (or read and write) under different conditions is as below:

+-------------+-------------+------------+-----------------------------+
| Frequency Limit (MHz)     | Dummy Bits | Comments                    |
+-------------+-------------+ Used       +                             +
| GPIO matrix | IOMUX pins  | By Driver  |                             |
+=============+=============+============+=============================+
| 26.6        | 80          | No         |                             |
+-------------+-------------+------------+-----------------------------+
| 40          | --          | Yes        | Half Duplex, no DMA allowed |
+-------------+-------------+------------+-----------------------------+

  232. And if the host only writes, the *dummy bit workaround* is not used and the frequency limit is as below:

+-------------------+------------------+
| GPIO matrix (MHz) | IOMUX pins (MHz) |
+===================+==================+
| 40                | 80               |
+-------------------+------------------+

The spi master driver can work even if the *input delay* in the ``spi_device_interface_config_t`` is set to 0.
However, setting an accurate value helps to: (1) calculate the frequency limit in full duplex mode, and (2) compensate
the timing correctly with dummy bits in half duplex mode. You may find the maximum data valid time after the launch edge
of the SPI clock in the AC characteristics chapter of the device specifications, or measure the time with an oscilloscope or
logic analyzer.
  243. .. wavedrom don't support rendering pdflatex till now(1.3.1), so we use the png here
  244. .. image:: /../_static/miso_timing_waveform_async.png
As shown in the figure above, the input delay is usually:

*[input delay] = [sample delay] + [slave output delay]*

1. The sample delay is the maximum random delay caused by SCLK and the
   peripheral clock of the slave being asynchronous. It is usually
   1 slave peripheral clock if that clock is asynchronous to SCLK, or 0 if
   the slave uses SCLK itself to latch and launch the MISO data. E.g.
   for ESP32 slaves, the delay is 12.5ns (1 APB clock), while it is reduced
   to 0 if the slave is in the same chip as the master.
2. The slave output delay is the time for MISO to become stable after the
   launch edge. E.g. for ESP32 slaves, the output delay is 37.5ns (3 APB
   clocks) when the IOMUX pins of the slave are used, or 62.5ns (5 APB clocks) if
   the signal goes through the GPIO matrix.

  257. Some typical delays are shown in the following table:

+--------------------+------------------+
| Device             | Input delay (ns) |
+====================+==================+
| Ideal device       | 0                |
+--------------------+------------------+
| ESP32 slave IOMUX* | 50               |
+--------------------+------------------+
| ESP32 slave GPIO*  | 75               |
+--------------------+------------------+
| ESP32 slave is on an independent      |
| chip, 12.5ns sample delay included.   |
+---------------------------------------+

The MISO path delay (tv), which consists of the slave *input delay* and the master *GPIO matrix delay*, ultimately determines the
frequency limit. Above this limit, full duplex mode will not work, and dummy bits are used in half duplex mode. The
frequency limit is:

*Freq limit[MHz] = 80 / (floor(MISO delay[ns]/12.5) + 1)*

The figure below shows the relation between the frequency limit and the input delay. 2 extra APB clocks should be added
to the MISO delay if the GPIO matrix in the master is used.
  276. .. image:: /../_static/spi_master_freq_tv.png
  277. Corresponding frequency limit for different devices with different *input delay* are shown in the following
  278. table:

+--------+------------------+----------------------+-------------------+
| Master | Input delay (ns) | MISO path delay (ns) | Freq. limit (MHz) |
+========+==================+======================+===================+
| IOMUX  | 0                | 0                    | 80                |
+ (0ns)  +------------------+----------------------+-------------------+
|        | 50               | 50                   | 16                |
+        +------------------+----------------------+-------------------+
|        | 75               | 75                   | 11.43             |
+--------+------------------+----------------------+-------------------+
| GPIO   | 0                | 25                   | 26.67             |
+ (25ns) +------------------+----------------------+-------------------+
|        | 50               | 75                   | 11.43             |
+        +------------------+----------------------+-------------------+
|        | 75               | 100                  | 8.89              |
+--------+------------------+----------------------+-------------------+

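
The frequency limit formula can be checked with a few lines of C. This is a sketch of the arithmetic only, not the implementation of ``spi_get_freq_limit``; the function and macro names are hypothetical, and the 25ns GPIO matrix delay corresponds to the 2 extra APB clocks mentioned above.

.. code-block:: c

    #define APB_CLOCK_NS          12.5  /* one APB clock period at 80MHz */
    #define GPIO_MATRIX_DELAY_NS  25.0  /* 2 extra APB clocks in the master */

    /* MISO path delay: slave input delay plus the master GPIO matrix delay. */
    double miso_path_delay_ns(double input_delay_ns, int use_gpio_matrix)
    {
        return input_delay_ns + (use_gpio_matrix ? GPIO_MATRIX_DELAY_NS : 0.0);
    }

    /* Freq limit[MHz] = 80 / (floor(MISO delay[ns] / 12.5) + 1).
     * The cast truncates, which equals floor() for non-negative delays. */
    double freq_limit_mhz(double miso_delay_ns)
    {
        int apb_cycles = (int)(miso_delay_ns / APB_CLOCK_NS);
        return 80.0 / (apb_cycles + 1);
    }

For an ESP32 slave on IOMUX pins (50ns input delay) and a master that also uses IOMUX pins, this gives 80 / 5 = 16MHz. Routing the master through the GPIO matrix raises the path delay to 75ns and lowers the limit to 80 / 7, about 11.43MHz.
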
  294. Thread Safety
  295. -------------
  296. The SPI driver API is thread safe when multiple SPI devices on the same bus are accessed from different tasks. However, the driver is not thread safe if the same SPI device is accessed from multiple tasks.
  297. In this case, it is recommended to either refactor your application so only a single task accesses each SPI device, or to add mutex locking around access of the shared device.
  298. .. _spi_known_issues:
  299. Known Issues
  300. ------------
  301. 1. Half duplex mode is not compatible with DMA when both writing and reading phases exist.
  302. If such transactions are required, you have to use one of the alternative solutions:
  303. 1. use full-duplex mode instead.
  304. 2. disable the DMA by setting the last parameter to 0 in bus initialization function just as below:
  305. ``ret=spi_bus_initialize(VSPI_HOST, &buscfg, 0);``
  306. this may prohibit you from transmitting and receiving data longer than 32 bytes.
  307. 3. try to use command and address field to replace the write phase.
  308. 2. Full duplex mode is not compatible with the *dummy bit workaround*, hence the frequency is limited. See :ref:`dummy
  309. bit speed-up workaround <dummy_bit_workaround>`.
  310. 3. ``cs_ena_pretrans`` is not compatible with command, address phases in full duplex mode.
  311. Application Example
  312. -------------------
  313. Display graphics on the 320x240 LCD of WROVER-Kits: :example:`peripherals/spi_master`.
  314. API Reference - SPI Common
  315. --------------------------
  316. .. include:: /_build/inc/spi_common.inc
  317. API Reference - SPI Master
  318. --------------------------
  319. .. include:: /_build/inc/spi_master.inc