Transcript for:
Exploring Harvard vs. Von Neumann Architecture

The Harvard Architecture is named after IBM's so-called Harvard Mark I, a computer that helped to develop the atomic bomb during World War II. In the Harvard Mark I, instructions were stored on punched paper tape, and data was stored in electromechanical counters within the CPU. The principle of storing instructions and data in different places became known as the Harvard Architecture.

Before we look at the Harvard Architecture, let's first consider a traditional single-core computer in which the programs and the data are both stored in the main memory, the RAM. This is known as the Von Neumann architecture. To fetch a program instruction or a data item into the CPU for processing, the relevant memory address is first sent to the main memory via the address bus. Then an instruction or a data item is sent back to the CPU via the data bus.

Saving a data item in the memory also requires that a memory address is sent to the memory via the address bus. Then the data is conveyed from the CPU to the memory via the data bus. Notice that memory addresses only ever travel in one direction, but data can travel both ways.
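
As a rough sketch of the read and write traffic just described, the following Python model treats the main memory as a single list that holds both instructions and data. The class name, the memory size, the addresses and the ("LOAD", 8) instruction format are all assumptions made for illustration; they don't come from any real machine.

```python
class VonNeumannMemory:
    """One memory holding both instructions and data (illustrative model)."""

    def __init__(self, size):
        self.cells = [0] * size

    def read(self, address):
        # The address travels CPU -> memory; the contents travel memory -> CPU.
        return self.cells[address]

    def write(self, address, value):
        # The address travels CPU -> memory; the value also travels CPU -> memory.
        self.cells[address] = value


ram = VonNeumannMemory(16)
ram.write(0, ("LOAD", 8))  # a made-up instruction stored at address 0
ram.write(8, 42)           # a data item stored at address 8

opcode, operand_address = ram.read(0)  # first access: fetch the instruction
value = ram.read(operand_address)      # second access: fetch the data item
```

In both methods the address is only ever an input, which mirrors the one-way address bus, while the value can go either in or out, like the two-way data bus.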

The fact that program instructions and data share the same memory and the same buses that connect the CPU to the memory means that they have to be fetched separately. An instruction to load a data item from the memory into the CPU, for example, will need two separate memory accesses to complete, one for the instruction and one for the data. This is known as the Von Neumann bottleneck. The Harvard architecture improves on this by storing instructions and data in separate memories.
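
To put some numbers on the bottleneck, here is a deliberately simplified Python sketch that counts memory-access time slots for a three-instruction program. It assumes every memory access takes exactly one slot and ignores pipelining and caches; the mnemonics and both function names are hypothetical.

```python
# Toy model: each memory access occupies one time slot on a bus.
# Each entry is (mnemonic, needs_data_access); the mnemonics are illustrative.
PROGRAM = [("LOAD", True), ("ADD", False), ("STORE", True)]


def von_neumann_slots(program):
    """Shared bus: an instruction fetch and its data access cannot overlap."""
    return sum(2 if needs_data else 1 for _, needs_data in program)


def harvard_slots(program):
    """Separate buses: a data access overlaps with the next instruction fetch."""
    slots = len(program)  # one slot per instruction fetch
    if program and program[-1][1]:
        slots += 1  # the final data access has no later fetch to overlap with
    return slots


print(von_neumann_slots(PROGRAM))  # 5
print(harvard_slots(PROGRAM))      # 4
```

The gap grows with the share of instructions that touch data, because in this model every such instruction costs the shared-bus machine an extra slot.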

And because these memories are separate, they don't need to be built in the same way. For example, depending on the application, there could be more instruction memory than data memory. This would require a wider address bus for the instruction memory.
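
The link between memory size and address bus width can be worked out directly: n address lines can select 2 to the power n locations. The sketch below shows the calculation in Python. The function name and the two memory sizes are made-up examples, not figures from any particular processor.

```python
def address_bus_width(locations):
    """Smallest number of address lines that can select every location."""
    if locations < 1:
        raise ValueError("a memory needs at least one location")
    # (locations - 1) is the highest address; its bit length is the width needed.
    return max(1, (locations - 1).bit_length())


instruction_locations = 65536  # hypothetical instruction memory size
data_locations = 1024          # hypothetical data memory size

print(address_bus_width(instruction_locations))  # 16
print(address_bus_width(data_locations))         # 10
```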

Furthermore, the instruction memory might have a bigger word width. That is, each location in the instruction memory could be capable of storing more bits than a location in the data memory. Consequently, the instruction bus would be wider than the data bus. Depending on the application, it might also be appropriate to use read-only memory for the instructions, but read-write memory for the data. The Harvard architecture is most often found in digital signal processors.
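
Here is one way the two memories might be modelled in Python, with different word widths and with the instruction memory made read-only. The 16-bit and 8-bit widths, the class names and the sample contents are assumptions chosen purely for illustration.

```python
INSTRUCTION_WORD_BITS = 16  # assumed width of an instruction memory location
DATA_WORD_BITS = 8          # assumed width of a data memory location


class ReadOnlyMemory:
    """Instruction memory: contents are fixed when it is created."""

    def __init__(self, words, word_bits):
        limit = 1 << word_bits
        if any(not 0 <= word < limit for word in words):
            raise ValueError("a word does not fit in the word width")
        self._words = tuple(words)  # a tuple cannot be modified afterwards

    def read(self, address):
        return self._words[address]


class ReadWriteMemory:
    """Data memory: every location can be read and overwritten."""

    def __init__(self, size, word_bits):
        self._mask = (1 << word_bits) - 1
        self._words = [0] * size

    def read(self, address):
        return self._words[address]

    def write(self, address, value):
        # Keep only the bits that fit in one location.
        self._words[address] = value & self._mask


instruction_memory = ReadOnlyMemory([0xA1B2, 0x0003], INSTRUCTION_WORD_BITS)
data_memory = ReadWriteMemory(4, DATA_WORD_BITS)
data_memory.write(0, 0x1FF)  # too wide for 8 bits, so only 0xFF is stored
```

Because the two objects are independent, nothing forces them to agree on size, word width or whether writing is allowed, which is the flexibility described above.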

So-called DSP applications include purpose-built hardware for audio and video processing; medical imaging applications such as X-ray, MRI and CAT scans; fitness trackers and smartwatches; and digital assistants such as Amazon's Alexa and Google Home. What these DSP applications have in common is the capture and digitisation of analogue information, followed by some form of processing of the digital signal generated. For example, a speech processing application might digitise a spoken instruction, then filter out any interference or background noise before passing it on to an AI to be interpreted and actioned.
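
The capture-then-process pattern can be sketched in a few lines of Python. This is only a toy: it quantises some made-up analogue samples to 8-bit values, then smooths them with a simple moving average, which stands in for the far more sophisticated filters a real DSP would run. The function names and sample values are assumptions.

```python
def digitise(samples, levels=256):
    """Quantise analogue values in the range -1.0..1.0 to integers 0..levels-1."""
    top = levels - 1
    digital = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # force out-of-range input into range
        digital.append(round((clipped + 1.0) / 2.0 * top))
    return digital


def moving_average(signal, window=3):
    """Smooth a signal by averaging each run of `window` consecutive samples."""
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    return [
        sum(signal[i:i + window]) / window
        for i in range(len(signal) - window + 1)
    ]


analogue = [-1.0, -0.5, 0.2, 1.0, 0.9, -0.3]  # made-up input samples
digital = digitise(analogue)
smoothed = moving_average(digital, window=3)
```

A loop like the one in moving_average reads its instructions over and over while streaming through fresh data, which is the kind of workload that benefits from separate instruction and data memories.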

Even a modern personal computer borrows design principles from the Harvard architecture. A typical CPU has several cores, each with its own arithmetic and logic unit, and their operation is managed by a single control unit. It also contains multiple levels of cache memory. Cache memory temporarily stores frequently used instructions and data, so that the CPU doesn't have to make repeated requests to the RAM.

Each core has its own Level 1 cache, which is closest to the core, meaning it can be accessed very quickly. Each core also has its own Level 2 cache, which is larger than the Level 1 cache, but a little slower to access. The Level 3 cache is shared by all of the cores. It has the biggest capacity, but operates more slowly than the other two levels.

Nevertheless, all of the CPU cache memory is much faster than the main memory, because there's no need for the instructions and data in the cache to travel along the data bus. Each Level 1 cache is split in two, some of it for instructions and some of it for data. This is sometimes referred to as the modified Harvard architecture.
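
A split Level 1 cache can be sketched as two separate lookup tables sitting in front of one shared main memory. The Python model below is a heavy simplification: it assumes unlimited cache capacity, never evicts anything, and leaves out Level 2 and Level 3 altogether. The class, its method names and the RAM contents are hypothetical.

```python
class Core:
    """One core with a split Level 1 cache in front of a shared RAM."""

    def __init__(self, ram):
        self.ram = ram
        self.l1_instructions = {}  # instruction half of the Level 1 cache
        self.l1_data = {}          # data half of the Level 1 cache
        self.ram_requests = 0

    def _lookup(self, cache, address):
        if address not in cache:
            # Cache miss: go out to the main memory once and keep a copy.
            self.ram_requests += 1
            cache[address] = self.ram[address]
        return cache[address]

    def fetch_instruction(self, address):
        return self._lookup(self.l1_instructions, address)

    def load_data(self, address):
        return self._lookup(self.l1_data, address)


ram = {0: "LOAD 8", 8: 42}  # made-up contents: one instruction, one data item
core = Core(ram)
for _ in range(3):
    core.fetch_instruction(0)
    core.load_data(8)

print(core.ram_requests)  # 2
```

Only the first pass through the loop reaches the RAM. After that, the instruction and the data item are each served from their own half of the cache, so neither has to wait for the other.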

The main benefit of this approach, of course, is that data and instructions can be fetched and processed simultaneously by the same core. So, as you've seen, the original design principle of the Harvard Mark I, that is, different memories for instructions and data, is still relevant today.